Abstract
Suppose X_n is an observation, or an average of observations, on a discretized signal ξ_n that is measured at n time points. The random vector X_n has a N(ξ_n, σ_n^2 I) distribution, with both the mean and the variance unknown. Under squared error loss, the unbiased estimator X_n of ξ_n can be improved by variable selection. Consider the candidate estimator ξ̂_n(A) whose i-th component equals the i-th component of X_n whenever i/(n+1) lies in A and vanishes otherwise. Allow the set A to range over a large collection of possibilities. A C_p-estimator is a candidate estimator that minimizes estimated quadratic loss over A. This paper constructs confidence sets that are centered at a C_p-estimator, have correct asymptotic coverage probability for ξ_n, and are geometrically smaller than or equal to the competing confidence balls centered at X_n. The asymptotics are locally uniform in the parameters (ξ_n, σ_n^2). The results illustrate an approach to inference after variable selection.
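The selection step described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: the collection of sets A is restricted to initial intervals [0, k/(n+1)], the estimated quadratic loss takes the familiar Mallows-type form (σ^2 for each retained coordinate, X_i^2 − σ^2 for each dropped one), and the variance is estimated from first differences in the spirit of Rice (1984). The function names `rice_variance` and `cp_estimate` are hypothetical, and the confidence-set construction itself is not attempted here.

```python
import numpy as np

def rice_variance(x):
    """First-difference variance estimate (an assumed choice, not taken from the paper)."""
    x = np.asarray(x, dtype=float)
    d = np.diff(x)
    return float(np.sum(d ** 2) / (2 * (len(x) - 1)))

def cp_estimate(x, sigma2):
    """Pick the initial segment A_k = {1, ..., k} that minimizes estimated loss.

    Assumed estimated loss for A_k (normalized by n):
        [k * sigma2 + sum_{i > k} (x_i^2 - sigma2)] / n
    Retained coordinates cost sigma2 each; a dropped coordinate costs an
    unbiased estimate of its squared mean.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    drop_cost = x ** 2 - sigma2
    # tail[k] = sum of drop_cost over the coordinates not retained by A_k
    tail = np.concatenate([np.cumsum(drop_cost[::-1])[::-1], [0.0]])
    est_loss = (sigma2 * np.arange(n + 1) + tail) / n
    k = int(np.argmin(est_loss))
    # Candidate estimator: keep the first k components of x, zero the rest
    estimate = np.where(np.arange(n) < k, x, 0.0)
    return estimate, k, est_loss

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 200
    signal = np.where(np.arange(n) < 20, 3.0, 0.0)  # hypothetical sparse signal
    x = signal + rng.normal(size=n)
    sigma2_hat = rice_variance(x)
    estimate, k, _ = cp_estimate(x, sigma2_hat)
    print("selected k:", k, "estimated variance:", round(sigma2_hat, 3))
```

Restricting A to nested initial segments keeps the search linear in n through cumulative sums; a richer collection of sets would need a different search, and the paper's own estimated loss and variance estimator may differ from the forms assumed above.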
Cite this article
Beran, R. Confidence sets centered at C_p-estimators. Ann Inst Stat Math 48, 1–15 (1996). https://doi.org/10.1007/BF00049285
DOI: https://doi.org/10.1007/BF00049285