Self-normalization: Taming a wild population in a heavy-tailed world

Applied Mathematics-A Journal of Chinese Universities

Abstract

The past two decades have witnessed the active development of a rich probability theory of Studentized statistics or self-normalized processes, typified by Student’s t-statistic as introduced by W. S. Gosset more than a century ago, and their applications to statistical problems in high dimensions, including feature selection and ranking, large-scale multiple testing, and sparse, high-dimensional signal detection. Many of these applications rely on the robustness of Studentization/self-normalization against heavy-tailed sampling distributions. This paper gives an overview of the salient progress in self-normalized limit theory, from Student’s t-statistic to more general Studentized nonlinear statistics. Prototypical examples include Studentized one- and two-sample U-statistics. Furthermore, we go beyond independence and glimpse some very recent advances in self-normalized moderate deviations under dependence.
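As a small illustration of the objects named in the abstract, the sketch below computes Student's t-statistic and the self-normalized sum S_n/V_n for one heavy-tailed sample, and checks the standard algebraic identity linking the two. It is not taken from the paper: the function names, the sample size, and the choice of a symmetrized Pareto distribution with tail index 1.5 (infinite variance) are assumptions made purely for illustration.

```python
import math
import random


def self_normalized_sum(xs):
    """Return S_n / V_n, with S_n = sum(x_i) and V_n^2 = sum(x_i^2)."""
    s_n = sum(xs)
    v_n = math.sqrt(sum(x * x for x in xs))
    return s_n / v_n


def student_t(xs):
    """Return Student's t-statistic sqrt(n) * mean / s for a zero null mean."""
    n = len(xs)
    mean = sum(xs) / n
    # Unbiased sample variance (divisor n - 1).
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return math.sqrt(n) * mean / math.sqrt(var)


def t_from_self_normalized(r, n):
    """Map r = S_n / V_n to the t-statistic via the exact identity
    T_n = r * sqrt((n - 1) / (n - r^2))."""
    return r * math.sqrt((n - 1) / (n - r * r))


def sample_heavy_tailed(n, rng):
    """Hypothetical test data: symmetrized Pareto draws with tail index 1.5,
    so the variance is infinite."""
    return [rng.choice((-1.0, 1.0)) * rng.paretovariate(1.5) for _ in range(n)]


rng = random.Random(0)
xs = sample_heavy_tailed(200, rng)
r = self_normalized_sum(xs)
t = student_t(xs)
print(f"S_n/V_n = {r:.4f}, T_n = {t:.4f}")
```

Because the t-statistic is a monotone function of S_n/V_n, tail probabilities for one translate directly into tail probabilities for the other, which is why limit theory is usually stated for the self-normalized sum.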

Author information

Correspondence to Qi-man Shao.

Additional information

Supported by Hong Kong RGC GRF 14302515.

Cite this article

Shao, Qm., Zhou, Wx. Self-normalization: Taming a wild population in a heavy-tailed world. Appl. Math. J. Chin. Univ. 32, 253–269 (2017). https://doi.org/10.1007/s11766-017-3552-y

  • DOI: https://doi.org/10.1007/s11766-017-3552-y
