Abstract
This paper focuses on computing a nearly optimal penalty in the method of empirical risk minimization. It is assumed that we have at our disposal the noisy data Y = θ + σξ, where \({\theta\in \mathbb{R}^n}\) is an unknown vector and \({\xi\in \mathbb{R}^n}\) is a standard white Gaussian noise. It is also assumed that the underlying vector θ is sparse, and therefore to recover θ we use a hard thresholding estimate \({\hat\theta_i(Y,t)=Y_i{\bf 1}\{|Y_i|\ge t\}}\). In order to adapt to the unknown sparsity of θ, the threshold t is assumed to be data-driven. A very popular approach to computing such thresholds is based on the principle of empirical risk minimization, which suggests the data-driven threshold \({\hat t =\text{arg\,min}_t\{\|Y-\hat\theta(Y,t)\|^2+Pen(Y,t)\}}\), where Pen(Y, t) is a penalty function. In this paper, it is proved with the help of a sharp oracle inequality that the main term in the optimal penalty is given by \({2\sigma^2\,\#\{i:|Y_i|\ge t\}\,\log[n/\#\{i:|Y_i|\ge t\}]}\).
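As an illustration only, the thresholding rule described above can be sketched in Python with NumPy. The sketch uses just the main term of the penalty quoted in the abstract, \({2\sigma^2 k\log(n/k)}\) with \({k=\#\{i:|Y_i|\ge t\}}\); any lower-order terms of the penalty analysed in the paper are not reproduced here. The function names and the cap `k_max` are hypothetical and do not come from the paper. The cap is an assumption added because the main term alone vanishes at k = n, so without it the criterion would trivially keep every coordinate.

```python
import numpy as np


def hard_threshold(y, t):
    """Hard thresholding: keep coordinates with |y_i| >= t, set the rest to 0."""
    return np.where(np.abs(y) >= t, y, 0.0)


def penalized_threshold(y, sigma, k_max=None):
    """Pick a data-driven threshold by penalized empirical risk minimization.

    Minimizes ||y - theta_hat(y, t)||^2 + 2 * sigma^2 * k * log(n / k),
    where k = #{i : |y_i| >= t}, over thresholds that keep the k largest
    magnitudes. Only the main penalty term is used (an assumption of this
    sketch), and ties among the |y_i| are assumed absent.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    if k_max is None:
        # Hypothetical safeguard: the main term is zero at k = n, so the
        # search is restricted to reasonably sparse candidates.
        k_max = n // 4

    a = np.sort(np.abs(y))[::-1]  # magnitudes in decreasing order
    sq = a ** 2
    # tail[k] = sum of squares of the n - k discarded coordinates
    tail = np.concatenate((np.cumsum(sq[::-1])[::-1], [0.0]))

    k = np.arange(n + 1)
    pen = np.zeros(n + 1)
    pen[1:] = 2.0 * sigma ** 2 * k[1:] * np.log(n / k[1:])

    crit = (tail + pen)[: k_max + 1]
    k_best = int(np.argmin(crit))
    # Keeping nothing corresponds to an infinite threshold.
    t_best = np.inf if k_best == 0 else float(a[k_best - 1])
    return t_best, k_best
```

Given data `y` and a known noise level `sigma`, calling `t_hat, k_hat = penalized_threshold(y, sigma)` followed by `hard_threshold(y, t_hat)` yields the corresponding estimate. Sorting the magnitudes once reduces the search over t to a scan over the n + 1 possible values of k.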
Cite this article
Golubev, Yu. On oracle inequalities related to data-driven hard thresholding. Probab. Theory Relat. Fields 150, 435–469 (2011). https://doi.org/10.1007/s00440-010-0280-0
Keywords
- Hard thresholding
- Empirical risk
- Penalization
- Oracle inequality
Mathematics Subject Classification (2000)
- 62G05