Single-pass low-storage arbitrary quantile estimation for massive datasets

Statistics and Computing

Abstract

We present a single-pass, low-storage, sequential method for estimating an arbitrary quantile of an unknown distribution. The proposed method performs very well when compared to existing methods for estimating the median as well as arbitrary quantiles for a wide range of densities. In addition to explaining the method and presenting the results of the simulation study, we discuss intuition behind the method and demonstrate empirically, for certain densities, that the proposed estimator converges to the sample quantile.
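
The abstract does not spell out the proposed estimator, so the following is not the authors' method. As a rough illustration of what single-pass, low-storage quantile estimation means, here is a minimal sketch of a generic stochastic-approximation update that keeps only the current estimate and a counter. The function name `streaming_quantile` and the `step` parameter are hypothetical, and the default `step=1.0` assumes the density near the target quantile is of order one (as for a uniform distribution on [0, 1]).

```python
import random


def streaming_quantile(stream, p, step=1.0):
    """Estimate the p-th quantile of `stream` in one pass with O(1) memory.

    Illustrative stochastic-approximation sketch only; it is NOT the
    estimator proposed in the article. `step` is a hypothetical tuning
    constant that should be roughly 1 / density at the target quantile.
    """
    estimate = None  # current quantile estimate (the only stored state, plus n)
    n = 0            # number of observations seen so far
    for x in stream:
        n += 1
        if estimate is None:
            # Initialise with the first observation.
            estimate = float(x)
            continue
        gain = step / n  # shrinking step size
        if x > estimate:
            # Observation above the estimate: nudge the estimate up.
            estimate += gain * p
        elif x < estimate:
            # Observation below the estimate: nudge the estimate down.
            estimate -= gain * (1.0 - p)
    return estimate


if __name__ == "__main__":
    rng = random.Random(0)
    data = (rng.random() for _ in range(100_000))  # uniform(0, 1) stream
    print(streaming_quantile(data, p=0.9))  # should land near 0.9
```

The expected movement per update is proportional to p minus the distribution function at the current estimate, so the estimate drifts toward the point where that difference is zero, which is the p-th quantile. A poorly chosen `step` slows convergence, which is one reason more elaborate single-pass estimators exist.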

About this article

Cite this article

Liechty, J.C., Lin, D.K.J. & McDermott, J.P. Single-pass low-storage arbitrary quantile estimation for massive datasets. Statistics and Computing 13, 91–100 (2003). https://doi.org/10.1023/A:1023296123228

  • DOI: https://doi.org/10.1023/A:1023296123228
