Abstract
We observe a random measure N and aim at estimating its intensity s. This statistical framework allows us to deal simultaneously with the problems of estimating a density, the marginals of a multivariate distribution, the mean of a random vector with nonnegative components, and the intensity of a Poisson process. Our estimation strategy is based on estimator selection: given a family of estimators of s based on the observation of N, we propose a selection rule, also based on N, to choose among them. Few assumptions are made on the collection of estimators, and their dependence on the observation N need not be known. The procedure can handle various problems, among which model selection, convex aggregation and the construction of T-estimators as studied recently in Birgé (Ann Inst H Poincaré Probab Stat 42(3):273–325, 2006). For illustration, we consider the problems of estimation, complete variable selection and selection among linear estimators in possibly non-Gaussian regression settings.
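To fix ideas, here is a minimal sketch of the generic estimator-selection setup in the density special case. It is not the paper's Hellinger-type selection rule: as an illustrative stand-in, it compares a family of histogram estimators via a squared Hellinger distance to a rough pilot estimate built from a held-out half of the sample; the function names and the data-splitting device are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def hist_density(sample, bins, grid):
    # Histogram density estimator on [0, 1], evaluated at the grid points.
    counts, edges = np.histogram(sample, bins=bins, range=(0.0, 1.0), density=True)
    idx = np.clip(np.searchsorted(edges, grid, side="right") - 1, 0, bins - 1)
    return counts[idx]

def hellinger_sq(p, q, dx):
    # Squared Hellinger distance between two densities tabulated on a grid.
    return 0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2) * dx

# Unknown density s (here a Beta(2, 5) for illustration); split the sample.
data = rng.beta(2.0, 5.0, size=2000)
train, hold = data[:1000], data[1000:]

n_grid = 512
grid = (np.arange(n_grid) + 0.5) / n_grid
dx = 1.0 / n_grid
pilot = hist_density(hold, 32, grid)  # pilot estimate from the held-out half

# Family of candidate estimators (histograms with varying bin counts),
# selected by minimizing the Hellinger-type criterion.
candidates = [2, 4, 8, 16, 64, 256]
risks = [hellinger_sq(hist_density(train, b, grid), pilot, dx) for b in candidates]
best = candidates[int(np.argmin(risks))]
print("selected number of bins:", best)
```

The point of the sketch is only the shape of the problem: the candidate estimators may be of arbitrary nature, and the selection rule uses the same observation that produced them.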
References
Antoniadis A., Besbeas P., Sapatinas T.: Wavelet shrinkage for natural exponential families with cubic variance functions. Sankhyā Ser. A 63(3), 309–327 (2001). Special issue on wavelets
Antoniadis A., Sapatinas T.: Wavelet shrinkage for natural exponential families with quadratic variance functions. Biometrika 88(3), 805–820 (2001)
Arlot, S.: Rééchantillonnage et Sélection de modèles. PhD thesis, University Paris XI (2007)
Arlot S.: Model selection by resampling penalization. Electron. J. Stat. 3, 557–624 (2009)
Arlot, S., Bach, F.: Data-driven calibration of linear estimators with minimal penalties. Technical report, HAL : hal-00414774, version 1 (2009)
Assouad P.: Deux remarques sur l’estimation. C. R. Acad. Sci. Paris Sér. I Math. 296(23), 1021–1024 (1983)
Baraud Y.: Model selection for regression on a fixed design. Probab. Theory Relat. Fields 117(4), 467–493 (2000)
Baraud Y., Birgé L.: Estimating the intensity of a random measure by histogram type estimators. Probab. Theory Relat. Fields 143(1–2), 239–284 (2009)
Baraud Y., Giraud C., Huet S.: Gaussian model selection with an unknown variance. Ann. Stat. 37(2), 630–672 (2009)
Barron A., Birgé L., Massart P.: Risk bounds for model selection via penalization. Probab. Theory Relat. Fields 113(3), 301–413 (1999)
Barron A.R., Cover T.M.: Minimum complexity density estimation. IEEE Trans. Inform. Theory 37(4), 1034–1054 (1991)
Bickel P.J., Ritov Y., Tsybakov A.B.: Simultaneous analysis of lasso and Dantzig selector. Ann. Stat. 37(4), 1705–1732 (2009)
Birgé L.: Approximation dans les espaces métriques et théorie de l’estimation. Z. Wahrsch. Verw. Gebiete 65(2), 181–237 (1983)
Birgé L.: Stabilité et instabilité du risque minimax pour des variables indépendantes équidistribuées. Ann. Inst. H. Poincaré Probab. Stat. 20(3), 201–223 (1984a)
Birgé L.: Sur un théorème de minimax et son application aux tests. Probab. Math. Stat. 3(2), 259–282 (1984b)
Birgé L.: Model selection via testing: an alternative to (penalized) maximum likelihood estimators. Ann. Inst. H. Poincaré Probab. Stat. 42(3), 273–325 (2006)
Birgé, L.: Model selection for Poisson processes. In: Cator, E., Jongbloed, G., Kraaikamp, C., Lopuhaä, R. Wellner, J. (eds.) Asymptotics: particles, processes and inverse problems, Festschrift for Piet Groeneboom, vol. 55, pp. 32–64. IMS Lecture Notes—Monograph Series (2007)
Birgé, L.: Model selection for density estimation with L2-loss. Technical report, arXiv:0808.1416 (2008)
Birgé L., Massart P.: An adaptive compression algorithm in Besov spaces. Constr. Approx. 16(1), 1–36 (2000)
Birgé L., Massart P.: Gaussian model selection. J. Eur. Math. Soc. (JEMS) 3(3), 203–268 (2001)
Borovkov, A.A.: Mathematical Statistics. Gordon and Breach, Amsterdam. Translated from the Russian by A. Moullagaliev and revised by the author (1998)
Bunea F., Tsybakov A.B., Wegkamp M.H.: Aggregation for Gaussian regression. Ann. Stat. 35(4), 1674–1697 (2007)
Candès E., Tao T.: The Dantzig selector: statistical estimation when p is much larger than n. Ann. Stat. 35(6), 2313–2351 (2007)
Castellan, G.: Density estimation via exponential model selection. Technical report, 00.25 Université Paris XI, Orsay (2000)
Castellan G.: Sélection d’histogrammes à l’aide d’un critère de type Akaike. C. R. Acad. Sci. Paris Sér. I Math. 330, 729–732 (2000)
Catoni, O.: Statistical learning theory and stochastic optimization. In: Lecture notes from the 31st Summer School on Probability Theory held in Saint-Flour, July 8–25, 2001. Springer, Berlin (2004)
Celisse, A.: Model selection via cross-validation in density estimation, regression, and change-points detection. PhD thesis, University Paris XI (2008)
Dalalyan A.S., Tsybakov A.B.: Aggregation by exponential weighting, sharp PAC-Bayesian bounds and sparsity. Mach. Learning 72(1–2), 39–61 (2008)
DeVore R., Lorentz G.: Constructive Approximation. Springer, Berlin (1993)
Genuer, R., Poggi, J.-M., Tuleau-Malot, C.: Variable selection using random forests. Pattern Recognit. Lett. (2010, to appear)
Giraud C.: Mixing least-squares estimators when the variance is unknown. Bernoulli 14(4), 1089–1107 (2008)
Goldenshluger A.: A universal procedure for aggregating estimators. Ann. Stat. 37(1), 542–568 (2009)
Goldenshluger A., Lepski O.: Structural adaptation via \({\mathbb L_p}\) -norm oracle inequalities. Probab. Theory Relat. Fields 143(1–2), 41–71 (2009)
Höskuldsson A.: Variable and subset selection in PLS regression. Chemom. Intell. Lab. Syst. 55, 23–38 (2001)
Juditsky A., Nemirovski A.: Functional aggregation for nonparametric regression. Ann. Stat. 28(3), 681–712 (2000)
Kolaczyk E.D., Nowak R.D.: Multiscale likelihood analysis and complexity penalized estimation. Ann. Stat. 32(2), 500–527 (2004)
Le Cam L.: Convergence of estimates under dimensionality restrictions. Ann. Stat. 1, 38–53 (1973)
Le Cam, L.: On local and global properties in the theory of asymptotic normality of experiments. In: Stochastic processes and related topics (Proc. Summer Res. Inst. Stat. Inference for Stochastic Processes, Indiana Univ., Bloomington, Ind., 1974, vol. 1; dedicated to Jerzy Neyman), pp. 13–54. Academic Press, New York (1975)
Lepskiĭ O.V.: A problem of adaptive estimation in Gaussian white noise. Teor. Veroyatnost. i Primenen. 35(3), 459–470 (1990)
Lepskiĭ O.V.: Asymptotically minimax adaptive estimation. I. Upper bounds. Optimally adaptive estimates. Teor. Veroyatnost. i Primenen. 36(4), 645–659 (1991)
Lepskiĭ O.V.: Asymptotically minimax adaptive estimation. II. Schemes without optimal adaptation. Adaptive estimates. Teor. Veroyatnost. i Primenen. 37(3), 468–481 (1992)
Lepskiĭ, O.V.: On problems of adaptive estimation in white Gaussian noise. In: Topics in Nonparametric Estimation. Adv. Soviet Math., vol. 12, pp. 87–106. American Mathematical Society, Providence (1992)
Leung G., Barron A.R.: Information theory and mixing least-squares regressions. IEEE Trans. Inform. Theory 52(8), 3396–3410 (2006)
Lugosi G., Nobel A.: Consistency of data-driven histogram methods for density estimation and classification. Ann. Stat. 24(2), 687–706 (1996)
Massart, P.: Concentration inequalities and model selection. Lecture Notes in Mathematics, vol. 1896. Springer, Berlin. Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23, 2003. With a foreword by Jean Picard (2007)
Nemirovski, A.: Topics in non-parametric statistics. In: Lectures on Probability Theory and Statistics (Saint-Flour, 1998). Lecture Notes in Mathematics, vol. 1738, pp. 85–277. Springer, Berlin (2000)
Priestley M.B., Chao M.T.: Non-parametric function fitting. J. R. Stat. Soc. Ser. B 34, 385–392 (1972)
Reynaud-Bouret P.: Adaptive estimation of the intensity of inhomogeneous Poisson processes via concentration inequalities. Probab. Theory Relat. Fields 126(1), 103–153 (2003)
Rigollet P., Tsybakov A.B.: Linear and convex aggregation of density estimators. Math. Methods Stat. 16(3), 260–280 (2007)
Tibshirani R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B 58(1), 267–288 (1996)
Tsybakov, A.B.: Optimal rates of aggregation. In: Proceedings of the 16th Annual Conference on Learning Theory (COLT) and 7th Annual Workshop on Kernel Machines, pp. 303–313. Lecture Notes in Artificial Intelligence, vol. 2777. Springer, Berlin (2003)
Tsybakov, A.B.: Introduction à l’estimation non-paramétrique. Mathématiques & Applications (Berlin) [Mathematics & Applications], vol. 41. Springer, Berlin (2004)
Wegkamp M.: Model selection in nonparametric regression. Ann. Stat. 31, 252–273 (2003)
Yang Y.: Model selection for nonparametric regression. Stat. Sinica 9, 475–499 (1999)
Yang Y.: Combining different procedures for adaptive regression. J. Multivar. Anal. 74(1), 135–161 (2000)
Yang Y.: Mixing strategies for density estimation. Ann. Stat. 28(1), 75–87 (2000)
Yang Y.: Adaptive regression by mixing. J. Am. Stat. Assoc. 96(454), 574–588 (2001)
Baraud, Y. Estimator selection with respect to Hellinger-type risks. Probab. Theory Relat. Fields 151, 353–401 (2011). https://doi.org/10.1007/s00440-010-0302-y
Keywords
- Estimator selection
- Model selection
- Variable selection
- T-estimator
- Histogram
- Estimator aggregation
- Hellinger loss