Bootstrapping sample quantiles of discrete data

Abstract

Sample quantiles are consistent estimators of the true quantile and satisfy central limit theorems (CLTs) if the underlying distribution is continuous. If the distribution is discrete, the situation is much more delicate. In this case, sample quantiles are known not to be consistent in general for the population quantiles. In a motivating example, we show that Efron’s bootstrap does not consistently mimic the distribution of sample quantiles even in the case of discrete independent and identically distributed (i.i.d.) data. To overcome this bootstrap inconsistency, we provide two different and complementary strategies. In the first part of this paper, we prove that \(m\)-out-of-\(n\)-type bootstraps do consistently mimic the distribution of sample quantiles in the discrete data case. As the corresponding bootstrap confidence intervals tend to be conservative due to the discreteness of the true distribution, we propose randomization techniques to construct bootstrap confidence sets of asymptotically correct size. In the second part, we consider a continuous modification of the cumulative distribution function and make use of mid-quantiles studied in Ma et al. (Ann Inst Stat Math 63:227–243, 2011). In contrast to ordinary quantiles, mid-quantiles are based on a continuous function, so they lose their discrete nature and can be estimated consistently. Moreover, Ma et al. (Ann Inst Stat Math 63:227–243, 2011) proved (non-)central limit theorems for i.i.d. data, which we generalize to the time series case. However, as the mid-quantile function fails to be differentiable, classical i.i.d. or block bootstrap methods do not lead to completely satisfactory results, and \(m\)-out-of-\(n\) variants are required here as well. The finite-sample performance of both approaches is illustrated in a simulation study by comparing coverage rates of bootstrap confidence intervals.
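
As a rough illustration of the two ingredients named in the abstract, the following Python sketch computes an ordinary sample quantile, the empirical mid-distribution function \(F_n(x) - 0.5\,p_n(x)\) in the sense of Parzen, and an \(m\)-out-of-\(n\) resampling distribution of the sample median for simulated count data. The function names, the toy data, and the choice \(m = n^{2/3}\) are assumptions made for illustration and are not taken from the paper; the sketch does not reproduce the paper's centering, scaling, or randomization steps.

```python
import math
import random
from collections import Counter


def sample_quantile(xs, p):
    """Ordinary sample quantile: smallest x whose empirical CDF is >= p."""
    s = sorted(xs)
    k = max(math.ceil(p * len(s)), 1)  # 1-based order-statistic index
    return s[k - 1]


def mid_cdf(xs, x):
    """Empirical mid-distribution function F_n(x) - 0.5 * p_n(x)."""
    n = len(xs)
    at_most = sum(1 for v in xs if v <= x)  # n * F_n(x)
    equal = sum(1 for v in xs if v == x)    # n * p_n(x)
    return (at_most - 0.5 * equal) / n


def m_out_of_n_quantiles(xs, p, m, reps, rng):
    """Quantiles of `reps` resamples of size m drawn with replacement."""
    return [sample_quantile(rng.choices(xs, k=m), p) for _ in range(reps)]


# Hypothetical discrete sample (assumption: i.i.d. draws from a small support).
rng = random.Random(0)
data = [rng.choice([0, 1, 1, 2, 3]) for _ in range(400)]

# Illustrative resample size with m much smaller than n (assumed choice).
m = int(len(data) ** (2 / 3))

boot = m_out_of_n_quantiles(data, 0.5, m, 500, rng)
print("sample median:", sample_quantile(data, 0.5))
print("mid-CDF at the sample median:", mid_cdf(data, sample_quantile(data, 0.5)))
print("m-out-of-n bootstrap medians:", Counter(boot))
```

Because the resampled medians can only take values in the observed support, the printed counts make the discreteness of the bootstrap distribution visible; setting `m = len(data)` in the same call gives Efron's bootstrap for comparison.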

Keywords

Bootstrap inconsistency; Count processes; Mid-distribution function; \(m\)-out-of-\(n\) bootstrap; Integer-valued processes

References

  1. Angus, J. E. (1993). Asymptotic theory for bootstrapping the extremes. Communications in Statistics - Theory and Methods, 22, 15–30.
  2. Athreya, K. B., Fukuchi, J. (1994). Bootstrapping extremes of i.i.d. random variables. In J. Galambos, J. Lechner, E. Simiu (Eds.), Proceedings of the Conference on Extreme Value Theory and Applications, NIST Special Publication 866 (vol. 3). New York: Springer.
  3. Athreya, K. B., Fukuchi, J. (1997). Confidence intervals for endpoints of a c.d.f. via bootstrap. Journal of Statistical Planning and Inference, 58, 299–320.
  4. Athreya, K. B., Fukuchi, J., Lahiri, S. N. (1999). On the bootstrap and the moving block bootstrap for the maximum of a stationary process. Journal of Statistical Planning and Inference, 76(1–2), 1–17.
  5. Bickel, P. J., Freedman, D. A. (1981). Some asymptotic theory for the bootstrap. The Annals of Statistics, 9, 1196–1217.
  6. Bickel, P. J., Sakov, A. (2008). On the choice of \(m\) in the \(m\) out of \(n\) bootstrap and confidence bounds for extrema. Statistica Sinica, 18, 967–985.
  7. Billingsley, P. (1995). Probability and measure. New York: Wiley.
  8. Chen, J., Lazar, N. A. (2010). Quantile estimation for discrete data via empirical likelihood. Journal of Nonparametric Statistics, 22, 237–255.
  9. Dedecker, J., Prieur, C. (2004). Couplage pour la distance minimale. Comptes Rendus Mathematique, 338, 805–808.
  10. Dedecker, J., Prieur, C. (2005). New dependence coefficients. Examples and applications to statistics. Probability Theory and Related Fields, 132, 203–236.
  11. Deheuvels, P., Mason, D., Shorack, G. (1993). Some results on the influence of extremes on the bootstrap. Annales de l’Institut Henri Poincaré, 29, 83–103.
  12. Del Barrio, E., Janssen, A., Pauly, M. (2013). The m(n) out of k(n) bootstrap for partial sums of St. Petersburg type games. Electronic Communications in Probability, 18, 1–10.
  13. Doukhan, P., Fokianos, K., Li, X. (2012a). On weak dependence conditions: the case of discrete valued processes. Statistics and Probability Letters, 82, 1941–1948.
  14. Doukhan, P., Fokianos, K., Tjøstheim, D. (2012b). On weak dependence conditions for Poisson autoregressions. Statistics and Probability Letters, 82, 942–948.
  15. Drost, F. C., van den Akker, R., Werker, B. J. M. (2009). Efficient estimation of auto-regression parameters and innovation distributions for semiparametric integer-valued AR(p) models. Journal of the Royal Statistical Society, Series B, 71, 467–485.
  16. Efron, B. (1979). Bootstrap methods: another look at the jackknife. The Annals of Statistics, 7, 1–26.
  17. Ferland, R., Latour, A., Oraichi, D. (2006). Integer count GARCH processes. Journal of Time Series Analysis, 27, 923–942.
  18. Fokianos, K. (2011). Some recent progress in count time series. Statistics: A Journal of Theoretical and Applied Statistics, 45, 49–58.
  19. Fokianos, K., Rahbek, A., Tjøstheim, D. (2009). Poisson autoregression. Journal of the American Statistical Association-Theory and Methods, 104, 1430–1439.
  20. Harrell, F. E., Davis, C. E. (1982). A new distribution-free quantile estimator. Biometrika, 69, 635–640.
  21. Horowitz, J. (2001). The bootstrap. In J. J. Heckman, E. E. Leamer (Eds.), Handbook of Econometrics 5, chapter 52 (pp. 3159–3228). Elsevier.
  22. Krantz, S. G. (1991). Real analysis and foundations. Boca Raton: CRC Press.
  23. Leucht, A., Neumann, M. H. (2013). Dependent wild bootstrap for degenerate U- and V-statistics. Journal of Multivariate Analysis, 117, 257–280.
  24. Ma, Y., Genton, M. G., Parzen, E. (2011). Asymptotic properties of sample quantiles of discrete distributions. Annals of the Institute of Statistical Mathematics, 63, 227–243.
  25. Mammen, E. (1992). When does the bootstrap work?: Asymptotic results and simulations. New York, Heidelberg: Springer.
  26. McKenzie, E. (1988). ARMA models for dependent sequences of Poisson counts. Advances in Applied Probability, 20, 822–835.
  27. Parzen, E. (1997). Concrete statistics. In S. Ghosh, W. R. Schucany, W. B. Smith (Eds.), Statistics in quality (pp. 309–332). New York: Marcel Dekker.
  28. Parzen, E. (2004). Quantile probability and statistical data modeling. Statistical Science, 19, 652–662.
  29. Pollard, D. (1984). Convergence of stochastic processes. New York: Springer.
  30. Resnick, S. I. (1987). Extreme values, regular variation, and point processes. New York: Springer.
  31. Santana, L. (2009). Contributions to the \(m\)-out-of-\(n\) bootstrap. Dissertation. North-West University, Potchefstroom Campus, South Africa.
  32. Serfling, R. J. (2002). Approximation theorems of mathematical statistics. New York: Wiley.
  33. Shao, J., Chen, Y. (1998). Bootstrapping sample quantiles based on complex survey data under hot deck imputation. Statistica Sinica, 8, 1071–1086.
  34. Sharipov, O. Sh., Wendler, M. (2013). Normal limits, nonnormal limits, and the bootstrap for quantiles of dependent data. Statistics and Probability Letters, 83, 1028–1035.
  35. Sun, S., Lahiri, S. N. (2006). Bootstrapping the sample quantile of a weakly dependent sequence. Sankhyā, 68, 130–166.
  36. Swanepoel, J. W. H. (1986). A note on proving that the (modified) bootstrap works. Communications in Statistics - Theory and Methods, 15, 3193–3203.
  37. Tempelmeier, H. (2000). Inventory service-levels in customer supply chain. OR Spektrum, 22, 361–380.
  38. Thas, O., De Neve, J., Clement, L., Ottoy, J.-P. (2012). Probabilistic index models. Journal of the Royal Statistical Society, Series B, 74(4), 623–671.
  39. Wang, D., Hutson, A. D. (2011). A fractional order statistic towards defining a smooth quantile function for discrete data. Journal of Statistical Planning and Inference, 141, 3142–3150.
  40. Weiß, C. H. (2008). Thinning operations for modeling time series of counts – a survey. Advances in Statistical Analysis, 92, 319–341.
  41. Wieczorek, B. (2014). Blockwise bootstrap of the estimated empirical process based on \(\psi \)-weakly dependent observations (Submitted).

Copyright information

© The Institute of Statistical Mathematics, Tokyo 2015

Authors and Affiliations

  1. Department of Economics, University of Mannheim, Mannheim, Germany
  2. Institut für Mathematische Stochastik, Technische Universität Braunschweig, Braunschweig, Germany
