Abstract
Aiming to estimate extreme precipitation forecast quantiles, we propose a nonparametric regression model that features a constant extreme value index. Using local linear quantile regression and an extrapolation technique from extreme value theory, we develop an estimator for conditional quantiles corresponding to extreme high probability levels. We establish uniform consistency and asymptotic normality of the estimators. In a simulation study, we examine the performance of our estimator on finite samples in comparison with a method assuming linear quantiles. On a precipitation data set in the Netherlands, these estimators have greater predictive skill compared to the upper member of ensemble forecasts provided by a numerical weather prediction model.
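The two-step approach described in the abstract, fitting a conditional quantile at a moderate level and then extrapolating the residual tail with extreme value theory, can be illustrated with a minimal sketch of the extrapolation step. This is not the authors' implementation: the function names, the bandwidth-free setup, and the simulated residuals are illustrative assumptions, and the local linear quantile regression step is taken as given (its residuals are the input `e`).

```python
import numpy as np

def hill_gamma(e, k):
    """Hill estimate of the extreme value index from the k largest residuals.

    Uses the (k+1)-th largest residual as the threshold; requires the
    top residuals to be positive.
    """
    e_sorted = np.sort(np.asarray(e, dtype=float))
    top, thresh = e_sorted[-k:], e_sorted[-k - 1]
    return float(np.mean(np.log(top) - np.log(thresh)))

def extreme_quantile(e, k, p):
    """Weissman-type extrapolated (1 - p)-quantile of the residual distribution.

    Extrapolates from the intermediate level 1 - k/n out to 1 - p,
    with p allowed to be far beyond the range of the data.
    """
    n = len(e)
    thresh = np.sort(np.asarray(e, dtype=float))[-k - 1]
    gamma = hill_gamma(e, k)
    return thresh * (k / (n * p)) ** gamma, gamma

# Illustrative use on simulated heavy-tailed residuals
# (inverse-uniform draws are Pareto with extreme value index 0.5):
# rng = np.random.default_rng(0)
# e = rng.uniform(size=20000) ** -0.5
# q, g = extreme_quantile(e, k=400, p=1e-3)
```

The final conditional quantile estimate would then be the fitted quantile curve at x plus the extrapolated residual quantile, mirroring the decomposition used in the proofs below.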
References
Beirlant, J., Wet, T.D., Goegebeur, Y.: Nonparametric estimation of extreme conditional quantiles. J. Stat. Comput. Simul. 74(8), 567–580 (2004)
Bentzien, S., Friederichs, P.: Generating and calibrating probabilistic quantitative precipitation forecasts from the high-resolution NWP model COSMO-DE. Weather Forecast. 27(4), 988–1002 (2012)
Bentzien, S., Friederichs, P.: Decomposition and graphical portrayal of the quantile score. Q. J. Roy. Meteorol. Soc. 140(683), 1924–1934 (2014)
Buishand, T., de Haan, L., Zhou, C.: On spatial extremes: with application to a rainfall problem. Ann. Appl. Stat. 2, 624–642 (2008)
Coles, S., Tawn, J.: Modelling extremes of the areal rainfall process. J. R. Stat. Soc. Ser. B. 58, 329–347 (1996)
Daouia, A., Gardes, L., Girard, S.: On kernel smoothing for extremal quantile regression. Bernoulli 19(5B), 2557–2589 (2013)
Daouia, A., Gardes, L., Girard, S., Lekina, A.: Kernel estimators of extreme level curves. Test 20(2), 311–333 (2011)
Davison, A.C., Smith, R.L.: Models for exceedances over high thresholds. J. R. Stat. Soc. Ser. B Methodol. 52(3), 393–442 (1990)
De Haan, L., Ferreira, A.: Extreme Value Theory: an Introduction. Springer, Berlin (2007)
Gardes, L., Girard, S.: Conditional extremes from heavy-tailed distributions: an application to the estimation of extreme rainfall return levels. Extremes 13(2), 177–204 (2010)
Gardes, L., Girard, S., Lekina, A.: Functional nonparametric estimation of conditional extreme quantiles. J. Multivar. Anal. 101(2), 419–433 (2010)
Gardes, L., Stupfler, G.: An integrated functional Weissman estimator for conditional extreme quantiles. REVSTAT - Statistical Journal 17(1), 109–144 (2019)
Goegebeur, Y., Guillou, A., Osmann, M.: A local moment type estimator for the extreme value index in regression with random covariates. Can. J. Stat. 42(3), 487–507 (2014)
Kalnay, E.: Atmospheric Modeling, Data Assimilation and Predictability. Cambridge University Press, Cambridge (2003)
Koenker, R.: Quantile Regression. Cambridge University Press, Cambridge (2005)
Kong, E., Linton, O., Xia, Y.: Uniform Bahadur representation for local polynomial estimates of M-regression and its application to the additive model. Econ. Theory 26(5), 1529–1564 (2010)
Martins-Filho, C., Yao, F., Torero, M.: Nonparametric estimation of conditional value-at-risk and expected shortfall based on extreme value theory. Econ. Theory 34(1), 23–67 (2018)
Rényi, A.: On the theory of order statistics. Acta Mathematica Academiae Scientiarum Hungarica 4(3-4), 191–231 (1953)
Wang, H.J., Li, D.: Estimation of extreme conditional quantiles through power transformation. J. Am. Stat. Assoc. 108(503), 1062–1074 (2013)
Wang, H.J., Li, D., He, X.: Estimation of high conditional quantiles for heavy-tailed distributions. J. Am. Stat. Assoc. 107(500), 1453–1464 (2012)
Wilks, D.S.: Statistical Methods in the Atmospheric Sciences. Academic Press, New York (2011)
Yu, K., Jones, M.: Local linear quantile regression. J. Am. Stat. Assoc. 93(441), 228–237 (1998)
Acknowledgements
The authors would like to sincerely thank the two referees and the associate editor for the constructive comments which led to a substantial improvement of this paper. This work is part of the research project “Probabilistic forecasts of extreme weather utilizing advanced methods from extreme value theory” with project number 14612 which is financed by the Netherlands Organisation for Scientific Research (NWO).
Appendix: Proofs
This section contains the proofs of Theorems 1–3 in Section 3. Throughout this section, c,c1,c2,… denote positive constants, which are not necessarily the same at each occurrence.
1.1 Proof of Theorem 1
The uniform consistency of \(\hat r\) relies heavily on the uniform Bahadur representation for \(\hat r\). We make use of the Bahadur representation obtained in Kong et al. (2010).
Let ψτ(u) = τ − I(u < 0), that is, the right derivative of ρτ at u. Then, by Corollary 3.3 and Proposition 1 in Kong et al. (2010), we have
where Cn,i(x) is a Lipschitz continuous function and thus absolutely bounded in [a,b]. Define
Then, the triangle inequality leads to
The last equality follows from the fact that r″ is uniformly bounded by Assumption A1.
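For reference, the check function of quantile regression and its right derivative, as used above, take the standard form (Koenker 2005); this display restates known identities rather than the paper's own notation:

```latex
\rho_{\tau}(u) = u\,\{\tau - I(u<0)\},\qquad
\psi_{\tau}(u) = \tau - I(u<0),
```

so that \(|\psi_{\tau}(u)|\le \max(\tau,1-\tau)\le 1\), the bound invoked repeatedly in this proof.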
Next, we show that there exists a \(\delta _{C}\in (0, \frac {1}{2}-\delta _{h})\) such that
Define \( T_{i}(x):= h_{n} K\left (\frac {X_{i}-x}{h_{n}}\right )C_{n, i}(x)\). Then for any x,y ∈ [a,b], by the triangle inequality and the Lipschitz continuity of K, we have
Note that the constant c does not depend on i; that is, the Lipschitz continuity is uniform in i for all the Ti’s. Consequently, since |ψτ(u)| ≤ 1, we have
Let \(M_{n} = n^{\delta _{C}+2\delta _{h}}\log n\) and {Ii = (ti,ti+ 1],i = 1,…,Mn} be a partition of (a,b], where \(t_{i+1} - t_{i} = \frac {b-a}{M_{n}}\). Then for t ∈ Ii,
or equivalently,
Therefore, for n sufficiently large,
where the third inequality follows from the fact that \(\frac {c(b-a)}{M_{n} {h_{n}^{2}}}<\frac {1}{2}n^{-\delta _{C}}\) for n sufficiently large. Next, we apply Hoeffding’s inequality to bound Pi. Define
For each i and n, {Wn,j,i, 1 ≤ j ≤ n} is a sequence of i.i.d. random variables, and, with probability one, |Wn,j,i|≤ sup− 1≤u≤ 1K(u) supa≤x≤bCn,i(x) =: c3. Moreover, \(\mathbb {E}\left (W_{n,j,i} \right )=0\) because \(\mathbb {E}(\psi _{\tau _{c}}(\epsilon _{j}))=0\) and Xj and 𝜖j are independent. Thus, by Hoeffding’s inequality,
Note that 1 − 2δh − 2δC > 0 by the choice of δC. Thus, for n →∞,
Hence, Eq. 2 is proved. Now by choosing δ = δC, we obtain via (1) that,
since \(\delta _{h}\in (\frac {1}{5}, \frac {1}{2})\) and \(\delta _{C}<\frac {1}{2}-\delta _{h}\).
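For completeness, the version of Hoeffding's inequality invoked above, for i.i.d. centered variables bounded in absolute value by \(c_3\), reads:

```latex
\mathrm{P}\Bigl(\Bigl|\sum_{j=1}^{n} W_{n,j,i}\Bigr| > t\Bigr)
\;\le\; 2\exp\!\Bigl(-\frac{t^{2}}{2n c_{3}^{2}}\Bigr),
\qquad t > 0,
```

which, with t of the order \(n^{1-\delta_C-2\delta_h}\), yields a bound summable over the \(M_n\) intervals of the partition.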
1.2 Proof of Theorem 2
The proof follows a similar line of reasoning as that of Theorem 2.1 in Wang et al. (2012). The uniform consistency of \(\hat r_{n}\) given in Theorem 1 plays a crucial role. Define \(V_{n} := ||\hat {r}_{n}-r||_{\infty } = o_{p}\left (n^{-\delta }\right )\).
Let Ui = FY |X(Yi|Xi) for all 1 ≤ i ≤ n. Then {Ui,i = 1,…,n} constitute i.i.d. random variables from a standard uniform distribution. Recall the definition of ei:
Thus, the ordering of {ei,i = 1,…,n} is not necessarily the same as the ordering of {Ui,i = 1,…,n}. The main task of this proof is to show that the kn largest ei’s correspond to the kn largest Ui’s; see Eq. 4. To this end, we first prove that, with probability tending to one, en−j,n for j = 0,…,kn can be decomposed as follows,
where i(j) is the index function defined as ei(j) = en−j,n. In view of Eq. ??, it is sufficient to prove that with probability tending to one, Ui(j) > τc jointly for all j = 0,…,kn. Define another index function, \(\tilde {i}(j)\) by \(U_{\tilde {i}(j)} = U_{n-j,n}\). Then it follows for n large enough,
where the second equality follows from the fact that QY |X(τc|Xi(j)) = r(Xi(j)) and the last equality follows from (??) and the fact that \(U_{n-k_{n},n} > \tau _{c}\) for n large enough. Then, \(\lim _{n\to \infty } \mathrm {P} \left (\cup _{j=0}^{k_{n}} \{U_{i(j)} < \tau _{c} \}\right ) = 0\) follows from \(Q_{\epsilon }(U_{n-k_{n},n}) \to \infty \) and Vn = op(1) as n →∞. Hence, Eq. 3 is proved.
Next, we show that
that is, the ordering of the kn largest residuals is determined by the ordering of the Ui’s. In view of Eq. 3, it is sufficient to show that, with probability tending to one,
By the second order condition given in Eq. ?? and Theorem 2.3.9 in De Haan and Ferreira (2007), for any small δ1,δ2 > 0, and n large enough,
for i = 1,…,kn, where \(W_{i}=\frac {1-U_{n-i,n}}{1-U_{n-i+1,n}}\) and \(\lim _{t\rightarrow \infty }A_{0}(t)/A(t)=1\). Observe that \(\log W_{i}= \log \frac {1}{1-U_{n-i+1,n}}-\log \frac {1}{1-U_{n-i,n}} \overset {d}{=}E_{n-i+1,n}-E_{n-i,n}\) with Ei’s i.i.d. standard exponential variables. Thus, by Rényi’s representation (Rényi 1953), we have
From Proposition 2.4.9 in De Haan and Ferreira (2007), we have \(\frac {U_{n-k_{n},n}}{1-\frac {k_{n}}{n}}\overset {P}{\rightarrow }1\), which implies that \({A_{0}\left (\frac {1}{1-U_{n-k_{n},n}}\right )}=O_{p}\left ({A_{0}\left (\frac {n}{k_{n}}\right )}\right )\). Using the fact that A0 is regularly varying with index \(\rho \), hence |A0| is ultimately decreasing, we obtain for n sufficiently large and any i = 1,…,kn,
by the assumption \(\sqrt {k_{n}}A\left (\frac {n}{k_{n}}\right )\rightarrow \lambda \).
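For reference, Rényi's representation (Rényi 1953), as used in the displays above, states that the normalized spacings of the order statistics of i.i.d. standard exponential variables are themselves independent standard exponentials; restated in the present notation (with \(E_{0,n}:=0\)):

```latex
\bigl\{\, i\,(E_{n-i+1,n}-E_{n-i,n}) \,\bigr\}_{i=1}^{n}
\;\overset{d}{=}\;
\{E_{i}\}_{i=1}^{n},
```

so that each \(\log W_{i}\overset{d}{=}E_{i}/i\) with the \(E_i\)'s independent standard exponentials.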
For a sufficiently large u and any kn ≥ 1,
which tends to one as u →∞. This implies that
Thus, combining Eqs. 6, 7 and 8, we have
where the third equality follows from \(\min _{1\leq i\leq k_{n}}\frac {E_{i}}{i}\overset {d}{=}E_{1,k}\overset {d}{=}\frac {E_{1}}{k}\) by Rényi’s representation. Thus, we obtain that
Thus, Eq. 5 is proved by the assumption \(k_{n}^{-1}\left (\frac {n}{k_{n}}\right )^{\gamma }\gg n^{-\delta }\) and \(\max _{0 \leq j \leq k_{n}}|r(X_{i(j)})-\hat r_{n}(X_{i(j)})|\leq 2 V_{n}=o_{p}\left (n^{-\delta }\right )\). Intuitively, Eq. 5 means that the difference between two successive upper order statistics of 𝜖 is larger than the error made in the estimation of r(x).
As aforementioned, Eqs. 3 and 5 together lead to Eq. 4, which further implies that with probability tending to one,
By the definition of \(\hat {\gamma }_{n}\) and Eq. 9, we can write the estimator as follows,
The first part is the well-known Hill estimator and, by Theorem 3.2.5 in De Haan and Ferreira (2007), we have
Therefore we can conclude,
by the assumption that \(k_{n}^{\gamma +1}n^{-\gamma -\delta }\rightarrow 0\).
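For reference, the Hill estimator appearing as the first part above, and its classical limit under the second order condition with \(\sqrt{k_n}A(n/k_n)\to\lambda\) (Theorem 3.2.5 in De Haan and Ferreira 2007), take the standard form; the superscript H is notational convenience here:

```latex
\hat\gamma^{H}_{k_n}
= \frac{1}{k_{n}}\sum_{i=0}^{k_{n}-1}
\log\frac{Q_{\epsilon}(U_{n-i,n})}{Q_{\epsilon}(U_{n-k_{n},n})},
\qquad
\sqrt{k_{n}}\,\bigl(\hat\gamma^{H}_{k_n}-\gamma\bigr)
\overset{d}{\longrightarrow}
N\Bigl(\frac{\lambda}{1-\rho},\,\gamma^{2}\Bigr),
```

which matches the distribution of the limit variable Γ used in the proof of Theorem 4 below.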
We remark that the proof of Theorem 2.1 in Wang et al. (2012) is not completely rigorous; namely, the proof of (S.1) in the supplementary material of that paper is flawed. We fix this problem while proving (9), which is the analogue of (S.1).
1.3 Proof of Theorem 4
Before we proceed with the proof of Theorem 3, we state the asymptotic normality of \(\hat {Q}_{\epsilon }(\tau _{n})\) defined in Eq. ?? in the theorem below.
Theorem 4
Let the conditions of Theorem 2 be satisfied. Assume npn = o(kn) and \(\log (np_{n}) = o(\sqrt {k_{n}})\). Then, as n →∞,
Theorem 4 can be proved in the same way as that for Theorem 2 in Wang et al. (2012). For the sake of completeness, we present the proof in this section.
Recall that \(\hat {Q}_{\epsilon }(\tau _{n}) = \left (\frac {k_{n}}{np_{n}} \right )^{\hat {\gamma }_{n}} e_{n-k_{n},n}=:d_{n}^{\hat {\gamma }_{n}} e_{n-k_{n},n}\). First, note that from Theorem 2, we have \(\sqrt {k_{n}}(\hat {\gamma }_{n}-\gamma )=\Gamma +o_{p}(1)\), where Γ is a random variable from \(N\left (\frac {\lambda }{1-\rho },\gamma ^{2}\right )\). Therefore,
where the last step follows from the assumption that \(\frac {\log d_{n}}{\sqrt {k_{n}}} \to 0\). Second, by Theorem 2.4.1 in De Haan and Ferreira (2007),
In combination with Eq. 9, we have
by the assumption that \(k_{n}^{\gamma +1}n^{-\gamma -\delta }\rightarrow 0\). Last, by the second order condition given in Eq. 7 and Theorem 2.3.9 in De Haan and Ferreira (2007),
Finally, combining (22), (23) and (24), we have
by the assumption that dn →∞. Thus, Eq. 21 follows immediately.
1.4 Proof of Theorem 3
By definition of \(\hat {Q}_{Y|X}(\tau _{n}|x)\) and Theorem 1, we have,
Thus it follows from Theorem 4 and the assumption \(\frac {\sqrt {k_{n}}p_{n}^{\gamma }}{n^{\delta }\log \left (\frac {k_{n}}{np_{n}}\right )} \to 0\) that
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Velthoen, J., Cai, JJ., Jongbloed, G. et al. Improving precipitation forecasts using extreme quantile regression. Extremes 22, 599–622 (2019). https://doi.org/10.1007/s10687-019-00355-1
Keywords
- Asymptotics
- Extreme conditional quantile
- Extreme precipitation
- Forecast skill
- Local linear quantile regression
- Statistical post-processing