Abstract
The past two decades have witnessed the active development of a rich probability theory of Studentized statistics, or self-normalized processes, typified by Student’s t-statistic as introduced by W. S. Gosset more than a century ago, together with applications to statistical problems in high dimensions, including feature selection and ranking, large-scale multiple testing, and sparse, high-dimensional signal detection. Many of these applications rely on the robustness of Studentization/self-normalization against heavy-tailed sampling distributions. This paper gives an overview of the salient progress in self-normalized limit theory, from Student’s t-statistic to more general Studentized nonlinear statistics. Prototypical examples include Studentized one- and two-sample U-statistics. Furthermore, we go beyond independence and glimpse some very recent advances in self-normalized moderate deviations under dependence.
Supported by Hong Kong RGC GRF 14302515.
Shao, Qm., Zhou, Wx. Self-normalization: Taming a wild population in a heavy-tailed world. Appl. Math. J. Chin. Univ. 32, 253–269 (2017). https://doi.org/10.1007/s11766-017-3552-y