Abstract
We present a single-pass, low-storage, sequential method for estimating an arbitrary quantile of an unknown distribution. In a simulation study covering a wide range of densities, the proposed method compares favorably with existing methods for estimating both the median and arbitrary quantiles. In addition to describing the method and reporting the simulation results, we discuss the intuition behind the method and demonstrate empirically, for certain densities, that the proposed estimator converges to the sample quantile.
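The abstract does not spell out the estimator itself, so the following is only a minimal sketch of what "single-pass, low-storage, sequential" means in practice. It uses a generic stochastic-approximation update, not the authors' method, and the class name `StreamingQuantile` and the step-size parameters `scale` and `decay` are assumptions introduced purely for illustration.

```python
import random


class StreamingQuantile:
    """Generic stochastic-approximation quantile tracker.

    Illustrative only: this is NOT the method proposed in the paper.
    It keeps a constant amount of state no matter how many values arrive.
    """

    def __init__(self, p, scale=1.0, decay=0.75):
        if not 0.0 < p < 1.0:
            raise ValueError("p must lie strictly between 0 and 1")
        self.p = p            # target quantile level, e.g. 0.5 for the median
        self.scale = scale    # hypothetical step-size constant
        self.decay = decay    # hypothetical step-size exponent, in (0.5, 1]
        self.n = 0            # number of observations seen so far
        self.estimate = None  # current quantile estimate

    def update(self, x):
        """Consume one observation and return the updated estimate."""
        self.n += 1
        if self.estimate is None:
            # Start from the first observation.
            self.estimate = float(x)
            return self.estimate
        step = self.scale / self.n ** self.decay
        # Move up by step * p when x lies above the estimate,
        # and down by step * (1 - p) when x lies at or below it.
        indicator = 1.0 if x <= self.estimate else 0.0
        self.estimate += step * (self.p - indicator)
        return self.estimate


def demo(p=0.9, n=100_000, seed=0):
    """Track the p-quantile of a simulated standard normal stream."""
    rng = random.Random(seed)
    tracker = StreamingQuantile(p)
    for _ in range(n):
        tracker.update(rng.gauss(0.0, 1.0))
    return tracker.estimate
```

Each observation is read once and then discarded, which is the property the abstract emphasizes; the accuracy of this particular update rule depends on the assumed step-size schedule and says nothing about the accuracy reported in the paper.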
Cite this article
Liechty, J.C., Lin, D.K.J. & McDermott, J.P. Single-pass low-storage arbitrary quantile estimation for massive datasets. Statistics and Computing 13, 91–100 (2003). https://doi.org/10.1023/A:1023296123228
DOI: https://doi.org/10.1023/A:1023296123228