
Smooth estimates of multiple quantiles in dynamically varying data streams

  • Hugo Lewi Hammer
  • Anis Yazidi
Theoretical advances

Abstract

In this paper, we investigate the problem of estimating multiple quantiles when samples are received online (a data stream). We assume a dynamical system, i.e., the distribution of the samples from the data stream changes with time. A major challenge of using incremental quantile estimators to track multiple quantiles is that the monotone property of quantiles is not guaranteed to hold, i.e., the estimate of a lower quantile might erroneously exceed the estimate of a higher quantile. Surprisingly, we have found only two papers in the literature that attempt to counter this challenge, namely the works of Cao et al. (Proceedings of the first ACM workshop on mobile internet through cellular networks, ACM, 2009) and Hammer and Yazidi (Proceedings of the 30th international conference on industrial engineering and other applications of applied intelligent systems (IEA/AIE), France, Springer, 2017), where the latter is a preliminary version of the work in this paper. Furthermore, the state-of-the-art incremental quantile estimator, the deterministic update-based multiplicative incremental quantile estimator (DUMIQE) due to Yazidi and Hammer (IEEE Trans Cybernet, 2017), fails to guarantee the monotone property when estimating multiple quantiles. A drawback of the solutions in Cao et al. (2009) and Hammer and Yazidi (2017) is that, even though the estimates satisfy the monotone property of quantiles, they can be highly irregular relative to each other, which is usually unrealistic from a practical point of view. In this paper, we propose generating the quantile estimates by inserting the quantile probabilities (e.g., \(0.1, 0.2, \ldots , 0.9\)) into a monotonically increasing and infinitely smooth function (i.e., one that can be differentiated infinitely many times). The function is incrementally updated from the data stream. The monotonicity and smoothness of the function ensure that both the monotone property and the regularity requirement of the quantile estimates are satisfied. The experimental results show that the method performs very well and estimates multiple quantiles more precisely than the original DUMIQE (Yazidi and Hammer 2017) and the approaches reported in Hammer and Yazidi (2017) and Cao et al. (2009).
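
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the method of the paper: the abstract does not specify the smooth function or its update rule, so everything below is an assumption made for illustration. The independent estimator uses a generic additive stochastic-approximation step (not the DUMIQE update), the smooth curve is a hypothetical two-parameter form Q(p) = mu + s * logit(p), and the names `update_independent`, `update_smooth`, `smooth_quantiles`, `lam` and `s_min` are invented here.

```python
import math

def logit(p):
    """Log-odds of p; strictly increasing and infinitely smooth on (0, 1)."""
    return math.log(p / (1.0 - p))

def update_independent(estimates, probs, x, lam):
    """Update each quantile estimate on its own with a generic additive rule.

    Illustrative stochastic-approximation step, not the DUMIQE update.
    """
    return [q + lam * (p - (1.0 if x <= q else 0.0))
            for q, p in zip(estimates, probs)]

def is_monotone(estimates):
    """True if the estimates are non-decreasing in the quantile probability."""
    return all(a <= b for a, b in zip(estimates, estimates[1:]))

def smooth_quantiles(mu, s, probs):
    """Evaluate the hypothetical curve Q(p) = mu + s * logit(p) at each p."""
    return [mu + s * logit(p) for p in probs]

def update_smooth(mu, s, probs, x, lam, s_min=1e-6):
    """One projected stochastic-gradient step on the summed pinball loss."""
    g_mu = g_s = 0.0
    for p, q in zip(probs, smooth_quantiles(mu, s, probs)):
        r = p - (1.0 if x <= q else 0.0)  # descent direction w.r.t. q
        g_mu += r
        g_s += r * logit(p)
    # Flooring s at s_min > 0 keeps the curve increasing in p.
    return mu + lam * g_mu, max(s + lam * g_s, s_min)

# Independent updates: a sample landing between two close estimates
# pushes the lower one up and the higher one down, so they cross.
crossed = update_independent([1.00, 1.01], [0.4, 0.6], x=1.005, lam=0.1)

# Shared smooth curve: the estimates stay ordered after every update.
probs = [k / 10 for k in range(1, 10)]
mu, s = 0.0, 1.0
always_ordered = True
for t in range(500):
    x = 5.0 + 2.0 * math.sin(0.1 * t)  # deterministic, slowly drifting toy stream
    mu, s = update_smooth(mu, s, probs, x, lam=0.05)
    always_ordered = always_ordered and is_monotone(smooth_quantiles(mu, s, probs))
```

With these inputs the independent estimates move to roughly 1.04 and 0.97, so the 0.4-quantile estimate ends up above the 0.6-quantile estimate. The curve-based estimates cannot cross, because all of them are read off a single function that is increasing in p whenever s is positive. A two-parameter curve is far less flexible than a function intended to track arbitrary distributions; it is used here only to keep the sketch short.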

Keywords

  • Dynamically changing data stream
  • Incremental estimator
  • Multiple quantiles
  • Smooth quantile estimates


References

  1. Arandjelovic O, Pham D-S, Venkatesh S (2015) Two maximum entropy-based algorithms for running quantile estimation in nonstationary data streams. IEEE Trans Circuits Syst Video Technol 25(9):1469–1479
  2. Cao J, Li L, Chen A, Bu T (2010) Tracking quantiles of network data streams with dynamic operations. In: 2010 Proceedings of IEEE INFOCOM. IEEE, pp 1–5
  3. Cao J, Li LE, Chen A, Bu T (2009) Incremental tracking of multiple quantiles for network monitoring in cellular networks. In: Proceedings of the 1st ACM workshop on Mobile internet through cellular networks. ACM, pp 7–12
  4. Chambers JM, James DA, Lambert D, Wiel SV (2006) Monitoring networked applications with incremental quantile estimation. Stat Sci 21:463–475
  5. Chen F, Lambert D, Pinheiro JC (2000) Incremental quantile estimation for massive tracking. In: Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 516–522
  6. Hammer HL, Yazidi A (2017) Incremental quantiles estimators for tracking multiple quantiles. In: Proceedings of the 30th international conference on industrial engineering and other applications of applied intelligent systems (IEA/AIE), France. Springer, pp 202–210
  7. Jain R, Chlamtac I (1985) The P2 algorithm for dynamic calculation of quantiles and histograms without storing observations. Commun ACM 28(10):1076–1085
  8. Luo G, Wang L, Yi K, Graham C (2016) Quantiles over data streams: experimental comparisons, new analyses, and further improvements. VLDB J 25(4):449–472
  9. Ma Q, Muthukrishnan S, Sandler M (2014) Frugal streaming for estimating quantiles: One (or two) memory suffices. arXiv preprint arXiv:1407.1121
  10. McDermott JP, Babu GJ, Liechty JC, Lin DKJ (2007) Data skeletons: simultaneous estimation of multiple quantiles for massive streaming datasets with applications to density estimation. Stat Comput 17(4):311–321
  11. Schmeiser BW, Deutsch SJ (1977) Quantile estimation from grouped data: the cell midpoint. Commun Stat Simul Comput 6(3):221–234
  12. Shrivastava N, Buragohain C, Agrawal D, Suri S (2004) Medians and beyond: new aggregation techniques for sensor networks. In: Proceedings of the 2nd international conference on Embedded networked sensor systems. ACM, pp 239–249
  13. Tierney L (1983) A space-efficient recursive procedure for estimating a quantile of an unknown distribution. SIAM J Sci Stat Comput 4(4):706–711
  14. Tschumitschew K, Klawonn F (2010) Incremental quantile estimation. Evol Syst 1(4):253–264
  15. Yazidi A, Hammer H (2017) Multiplicative update methods for incremental quantile estimation. IEEE Trans Cybernet 49(3):746–756

Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  1. Department of Computer Science, Oslo Metropolitan University, Oslo, Norway
