Tracking of multiple quantiles in dynamically varying data streams


In this paper, we consider the problem of tracking multiple quantiles of dynamically varying data stream distributions. The method is based on making incremental updates of the quantile estimates every time a new sample is received. The method is memory and computationally efficient since it only stores one value for each quantile estimate and only performs one operation per quantile estimate when a new sample is received from the data stream. The estimates are realistic in the sense that the monotone property of quantiles is satisfied in every iteration. Experiments show that the method efficiently tracks multiple quantiles and outperforms state-of-the-art methods.

This is a preview of subscription content, access via your institution.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6


  1. 1.

    Abbasi B, Guillen M (2013) Bootstrap control charts in monitoring value at risk in insurance. Expert Syst Appl 40(15):6125–6135

    Article  Google Scholar 

  2. 2.

    Alkhamees N, Fasli M (2016) Event detection from social network streams using frequent pattern mining with dynamic support values. In: 2016 IEEE international conference on big data (big data). IEEE, pp 1670–1679

  3. 3.

    Arasu A, Manku GS (2004) Approximate counts and quantiles over sliding windows. In: Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on principles of database systems. ACM, pp 286–296

  4. 4.

    Cao J, Li L, Chen A, Bu T (2010) Tracking quantiles of network data streams with dynamic operations. In: INFOCOM, 2010 proceedings IEEE. IEEE, pp 1–5

  5. 5.

    Cao J, Li LE, Chen A, Bu T (2009) Incremental tracking of multiple quantiles for network monitoring in cellular networks. In: Proceedings of the 1st ACM workshop on mobile internet through cellular networks. ACM, pp 7–12

  6. 6.

    Chen F, Lambert D, Pinheiro JC (2000) Incremental quantile estimation for massive tracking. In: Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 516–522

  7. 7.

    Choi B-Y, Moon S, Cruz R, Zhang Z-L, Diot C (2007) Quantile sampling for practical delay monitoring in internet backbone networks. Comput Netw 51(10):2701–2716

    Article  Google Scholar 

  8. 8.

    Datar M, Gionis A, Indyk P, Motwani R (2002) Maintaining stream statistics over sliding windows. SIAM J Comput 31(6):1794–1813

    MathSciNet  Article  Google Scholar 

  9. 9.

    Fang Y, Zhang H, Ye Y, Li X (2014) Detecting hot topics from twitter: a multiview approach. J Inf Sci 40(5):578–593

    Article  Google Scholar 

  10. 10.

    Gilbert AC, Guha S, Indyk P, Kotidis Y, Muthukrishnan S, Strauss MJ (2002) Fast, small-space algorithms for approximate histogram maintenance. In: Proceedings of the thiry-fourth annual ACM symposium on theory of computing. ACM, pp 389–398

  11. 11.

    Gilbert AC, Kotidis Y, Muthukrishnan S, Strauss MJ (2002) How to summarize the universe: dynamic maintenance of quantiles. In: Proceedings of the 28th Kotidisinternational conference on very large data bases. VLDB Endowment, pp 454–465

  12. 12.

    Gilli M et al (2006) An application of extreme value theory for measuring financial risk. Comput Econ 27(2–3):207–228

    Article  Google Scholar 

  13. 13.

    Hammer HL, Yazidi A (2017) Incremental quantiles estimators for tracking multiple quantiles. In: International conference on industrial, engineering and other applications of applied intelligent systems. Springer, pp 202–210

  14. 14.

    Hammer HL, Yazidi A (2018) A novel incremental quantile estimator using the magnitude of the observations. In: 2018 26th Mediterranean conference on control and automation (MED). IEEE, pp 290–295

  15. 15.

    Hammer HL, Yazidi A, John OB (2018) On the classification of dynamical data streams using novel anti-Bayesian techniques. Pattern Recognit 76:108–124

    Article  Google Scholar 

  16. 16.

    Hammer HL, Yazidi A, Rue H (2018) A new quantile tracking algorithm using a generalized exponentially weighted average of observations. Appl Intell.

    Article  Google Scholar 

  17. 17.

    Lin X, Lu H, Xu J, Yu JX (2004) Continuously maintaining quantile summaries of the most recent n elements over a data stream. In: Proceedings of the 20th international conference on data engineering, 2004. IEEE, pp 362–373

  18. 18.

    Luo G, Wang L, Yi K, Cormode G (2016) Quantiles over data streams: experimental comparisons, new analyses, and further improvements. VLDB J 25(4):1–24

    Article  Google Scholar 

  19. 19.

    Ma Q, Muthukrishnan S, Sandler M (2013) Frugal streaming for estimating quantiles. In: Brodnik A, López-Ortiz A, Raman V, Viola A (eds) Space-efficient data structures, streams, and algorithms. Springer, Berlin, pp 77–96

    Google Scholar 

  20. 20.

    Norman MF (1972) Markov processes and learning models, vol 84. Academic Press, New York

    Google Scholar 

  21. 21.

    Sommers J, Barford P, Duffield N, Ron A (2007) Accurate and efficient SLA compliance monitoring. In: ACM SIGCOMM computer communication review, vol 37. ACM, pp 109–120

  22. 22.

    Sommers J, Barford P, Duffield N, Ron A (2010) Multiobjective monitoring for SLA compliance. IEEE/ACM Trans Netw (TON) 18(2):652–665

    Article  Google Scholar 

  23. 23.

    Song G, Ye Y, Zhang H, Xu X, Lau RYK, Liu F (2016) Dynamic clustering forest: an ensemble framework to efficiently classify textual data stream with concept drift. Inf Sci 357:125–143

    Article  Google Scholar 

  24. 24.

    Stahl V, Fischer A, Bippus R (2000) Quantile based noise estimation for spectral subtraction and wiener filtering. In: 2000 IEEE international conference on acoustics, speech, and signal processing, 2000. ICASSP’00, vol 3. IEEE, pp 1875–1878

  25. 25.

    Tierney L (1983) A space-efficient recursive procedure for estimating a quantile of an unknown distribution. SIAM J Sci Stat Comput 4(4):706–711

    MathSciNet  Article  Google Scholar 

  26. 26.

    Tiwari N, Pandey PC (2018) A technique with low memory and computational requirements for dynamic tracking of quantiles. J Signal Process Syst.

    Article  Google Scholar 

  27. 27.

    Yazidi A, Hammer HL (2017) Multiplicative update methods for incremental quantile estimation. IEEE Trans Cybern.

    Article  Google Scholar 

  28. 28.

    Zhang L, Guan Y (2008) Detecting click fraud in pay-per-click streams of online advertising networks. In: The 28th international conference on distributed computing systems, 2008. ICDCS’08. IEEE, pp 77–84

  29. 29.

    Zhang X, Alexander L, Hegerl GC, Jones P, Tank AK, Peterson TC, Trewin B, Zwiers FW (2011) Indices for monitoring changes in extremes based on daily temperature and precipitation data. Wiley Interdiscip Rev Clim Change 2(6):851–870

    Article  Google Scholar 

Download references

Author information



Corresponding author

Correspondence to Hugo Lewi Hammer.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Proof of Theorem 1

Appendix: Proof of Theorem 1

We will first present a theorem due to Norman [20] that will be used to prove Theorem 1. Norman [20] studied distance “diminishing models”. The convergence of \({\widehat{Q}}_n(q_k)\) to \(Q(q_k)\) is a consequence of this theorem.

Theorem 2

Letx(t) be a stationary Markov process dependent on a constant parameter\(\theta \in [0,1]\). Each\(x(t) \in I\), whereIis a subset of the real line. Let\(\delta x(t)=x(t+1)-x(t)\). The following are assumed to hold:

  1. 1.

    I is compact

  2. 2.

    \(E [\delta x(t) | x(t)=y]= \theta w(y)+ O(\theta ^2)\)

  3. 3.

    \(Var [\delta x(t) | x(t)=y]= \theta ^2 s(y)+ O(\theta ^2)\)

  4. 4.

    \(E [\delta x(t)^3 | x(t)=y]= O(\theta ^3)\)where\(sup_{y \in I} \frac{O(\theta ^k)}{\theta ^k}< \infty\)for\(k=2,3\)and\(sup_{y \in I} \frac{o(\theta ^2)}{\theta ^2} \rightarrow 0\) as \(\theta \rightarrow 0\)

  5. 5.

    w(y) has a Lipschitz derivative inI

  6. 6.

    s(y) is LipschitzI.

If Assumptions 1 to 6 above hold,w(y) has a unique root\(y^*\)inIand\(\frac{d w}{d y} \bigg |_{y=y^*} \le 0\)then

  1. 1.

    \(var [\delta x(t) | x(0)=x]=O(\theta )\)uniformly for all\(x \in I\)and\(t \ge 0\). For any\(x \in I\), the differential equation\(\frac{d y(\tau )}{d \tau }=w(y(t))\)has a unique solution\(y(\tau )=y(\tau ,x)\)with\(y(0)=x\)and\(E [\delta x(t) | x(0)=x]=y( t \theta )+O(\theta )\)uniformly for all\(x \in I\) and \(t \ge 0\).

  2. 2.

    \(\frac{x(t)-y(t \theta )}{\sqrt{\theta }}\)has a normal distribution with zero mean and finite variance as\(\theta \rightarrow 0\)and\(t \theta \rightarrow \infty\).

Having presented Theorem 2, we are now ready to prove Theorem 1 by resorting to Theorem 2. This is the main result of this paper. We will prove the convergence of \({\widehat{Q}}_n(q_k)\) for \(k=2,3,\ldots ,K-1\) below. The proof for \({\widehat{Q}}_n(q_1)\) and \({\widehat{Q}}_n(q_K)\) can be done in the same manner and are not shown in this paper for the sake of brevity.


We now start by showing that the Markov process based on the updating rules (9)–(11) satisfies the assumptions 1 to 6 in Theorem 2.

$$\begin{aligned}&E\left( \delta {\widehat{Q}}_{n}(q_k)\,\left| \,{\widehat{Q}}_n(q_k)\right. \right)\\&\quad = E\left( \delta {\widehat{Q}}_{n}(q_k)\,\left| \,{\widehat{Q}}_n(q_k) \ge X\right. \right) P\left( {\widehat{Q}}_n(q_k) \ge X \right) \\&\qquad +E\left( \delta {\widehat{Q}}_{n}(q_k)\,|\,{\widehat{Q}}_n(q_k)< X\right) P\left( {\widehat{Q}}_n(q_k) < X\right) \\&\quad = \beta H\left( {\widehat{Q}}_{n}(q_{k}); {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) q_k {\widehat{Q}}_{n}(q_{k})\left( 1 - F_X\left( {\widehat{Q}}_{n}(q_{k})\right) \right)\\&\qquad - \beta H\left( {\widehat{Q}}_{n}(q_{k}); {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) (1 - q_k) {\widehat{Q}}_{n}(q_{k}) F_X\left( {\widehat{Q}}_{n}(q_{k})\right) \\&\quad = \beta H\left( {\widehat{Q}}_{n}(q_{k}); {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) {\widehat{Q}}_{n}(q_{k}) \left( q_k - F_X\left( {\widehat{Q}}_{n}(q_{k})\right) \right) \end{aligned}$$

We now let \(\theta = \beta\), \(y = {\widehat{Q}}_{n}(q_{k})\) and \(w\left( {\widehat{Q}}_{n}(q_{k})\right)\) be equal to “everything” in (13) except \(\beta\). It is easy to see that assumption 2 in Theorem 2 is satisfied. Next, we turn to assumption 5 which requires that \(w\left( {\widehat{Q}}_{n}(q_{k})\right)\) has a Lipschitz derivative with respect to \({\widehat{Q}}_{n}(q_{k})\). Unfortunately it is not obvious that this is satisfied since H has a discontinuous derivative with respect to \({\widehat{Q}}_{n}(q_{k})\) due to the min-function in (6). To show that both assumptions 2 and 5 are satisfied, we need to perform a subtle modification of (13) as follows. A typical example of H as a function of \({\widehat{Q}}_{n}(q_{k})\) is shown as the black curve in Fig. 7. We define a function \(H^{*}\) that is equal to H except for an the interval \([Q^{*} - \delta , Q^{*} + \delta ]\) (see Fig. 7). In the interval \([Q^{*} - \delta , Q^{*} + \delta ]\), \(H^{*}\) is a function that is smaller then H (to satisfy the monotone property) and has a Lipschitz derivative. This requires that \(H^{*}\) satisfy the following

$$\begin{aligned}&H^{*}\left( Q^{*} - \delta ; {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) \\&\quad =H\left( Q^{*} - \delta ; {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) \\&\frac{{\mathrm{d}} H^{*}}{{\mathrm{d}} {\widehat{Q}}_{n}(q_{k})} \left( Q^{*} - \delta ; {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) \\&\quad = \frac{{\mathrm{d}} H}{{\mathrm{d}} {\widehat{Q}}_{n}(q_{k})} \left( Q^{*} - \delta ; {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) \\&H^{*}\left( Q^{*} + \delta ; {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) \\&\quad = H\left( Q^{*} + \delta ; {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) \\&\frac{{\mathrm{d}} H^{*}}{{\mathrm{d}} {\widehat{Q}}_{n}(q_{k})} \left( Q^{*} + \delta ; {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) \\&\quad = \frac{{\mathrm{d}} H}{{\mathrm{d}} {\widehat{Q}}_{n}(q_{k})} \left( Q^{*} + \delta ; {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) \end{aligned}$$

i.e. that the function value and the derivative must be equal for H and \(H^{*}\) in \(Q^{*} - \delta\) and \(Q^{*} + \delta\). It is straightforward to satisfy these criteria, e.g. by fitting a polynomial. \(H^{*}\) is illustrated as the grey curve in Fig. 7 (and is equal to H outside the interval). By reducing the value of \(\delta\), \(H^{*}\) will be more and more similar to H. In other words, there exists always a \(\delta\) such that

$$\left| H\left( {\widehat{Q}}_{n}(q_{k}); {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) - H^{*}\left( {\widehat{Q}}_{n}(q_{k}); {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) \right| < \beta \quad \forall \,{\widehat{Q}}_{n}(q_{k}) \in \left[ {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right]$$

which means that we can write

$$H\left( {\widehat{Q}}_{n}(q_{k}); {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) = H^{*}\left( {\widehat{Q}}_{n}(q_{k}); {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) + O(\beta )$$

Substituting (14) into (13) we obtain

$$\beta H^{*} \left( {\widehat{Q}}_{n}(q_{k}); {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) {\widehat{Q}}_{n}(q_{k}) \left( q_k - F_X\left( {\widehat{Q}}_{n}(q_{k})\right) \right) + \beta {\widehat{Q}}_{n}(q_{k}) \left( q_k - F_X\left( {\widehat{Q}}_{n}(q_{k})\right) \right) O(\beta ) = \beta H^{*} \left( {\widehat{Q}}_{n}(q_{k}); {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) {\widehat{Q}}_{n}(q_{k}) \left( q_k - F_X\left( {\widehat{Q}}_{n}(q_{k})\right) \right) + O(\beta ^2)$$

Since \(H^{*}\) has a Lipschitz derivative, we see that with

$$w\left( {\widehat{Q}}_{n}(q_{k})\right) = H^{*}\left( {\widehat{Q}}_{n}(q_{k});{\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) {\widehat{Q}}_{n}(q_{k}) \left( q_k - F_X\left( {\widehat{Q}}_{n}(q_{k})\right) \right)$$

both assumptions 2 and 5 are satisfied.

Next we turn to assumption 3.

$$\begin{aligned}&E\left( \delta {\widehat{Q}}_{n}(q_k)^2\,\left| \,{\widehat{Q}}_n(q_k)\right. \right) \\&\quad = E\left( \delta {\widehat{Q}}_{n}(q_k)^2\,\left| \,{\widehat{Q}}_n(q_k) \ge X \right. \right) P\left( {\widehat{Q}}_n(q_k) \ge X \right) \\&\qquad + E\left( \delta {\widehat{Q}}_{n}(q_k)^2\,\left| \,{\widehat{Q}}_n(q_k)< X\right. \right) P\left( {\widehat{Q}}_n(q_k) < X\right) \\&\quad = \beta ^2 \left( H\left( {\widehat{Q}}_{n}(q_{k}); {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) q_k {\widehat{Q}}_{n}(q_{k}) \right) ^2 \left( 1 - F_X\left( {\widehat{Q}}_{n}(q_{k})\right) \right) \\&\qquad - \beta ^2 \left( H\left( {\widehat{Q}}_{n}(q_{k}); {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) (1 - q_k) {\widehat{Q}}_{n}(q_{k}) \right) ^2 F_X\left( {\widehat{Q}}_{n}(q_{k})\right) \end{aligned}$$
$$\begin{aligned}&{\mathrm{Var}}\left( \delta {\widehat{Q}}_{n}(q_k)\,\left| \,{\widehat{Q}}_n(q_k)\right. \right) = E\left( \delta {\widehat{Q}}_{n}(q_k)^2\,\left| \,{\widehat{Q}}_n(q_k)\right. \right) - E\left( \delta {\widehat{Q}}_{n}(q_k)\,\left| \,{\widehat{Q}}_n(q_k)\right. \right) ^2\\ \nonumber&\quad = \beta ^2 \left( H\left( {\widehat{Q}}_{n}(q_{k}); {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) q_k {\widehat{Q}}_{n}(q_{k}) \right) ^2 \left( 1 - F_X\left( {\widehat{Q}}_{n}(q_{k})\right) \right) \\ \nonumber&\qquad - \beta ^2 \left( H\left( {\widehat{Q}}_{n}(q_{k}); {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) (1 - q_k) {\widehat{Q}}_{n}(q_{k}) \right) ^2 F_X\left( {\widehat{Q}}_{n}(q_{k})\right) \\ \nonumber&\qquad + \beta ^2 \left( H\left( {\widehat{Q}}_{n}(q_{k}); {\widehat{Q}}_{n}(q_{k-1}), {\widehat{Q}}_{n}(q_{k+1}) \right) {\widehat{Q}}_{n}(q_{k}) \left( q_k - F_X\left( {\widehat{Q}}_{n}(q_{k})\right) \right) \right) ^2 \end{aligned}$$

We see that assumption 3 is satisfied with \(s\left( {\widehat{Q}}_{n}(q_{k})\right)\) equal to everything in (16) except \(\beta ^2\). Since H is Lipschitz (has a bounded derivative), \(s\left( {\widehat{Q}}_{n}(q_{k})\right)\) is Lipschitz and assumption 6 is also satisfied. Assumption 4 can now be proved in the same manner.

We will use the results of Norman to prove the convergence. It is easy to see that \(w\left( {\widehat{Q}}_{n}(q_{k})\right)\) in (15) admits two roots \({\widehat{Q}}_n(q_k) = {F_X}^{-1}(q_k) = Q(q_k)\) and \({\widehat{Q}}_n(q_k) = 0\). By introducing an arbitrarily small lower bound \(Q_{\text {min}}>0\) on estimate \({\widehat{Q}}_{n}(q_{k})\), we can avoid the \({\widehat{Q}}_n(q_k)=0\). This is easily implemented by modifying the update rules and adding \(Q_{\text {min}}\) to the right term of Eqs. (9)–(11). Therefore, the unique root becomes \({\widehat{Q}}_n(q_k)={F_X}^{-1}(q_k) = Q(q_k)\).

Fig. 7

The black and the grey curves show the functions H and \(H^{*}\), respectively

We now differentiate to get

$$\frac{{\mathrm{d}}w\left( {\widehat{Q}}_{n}(q_{k})\right) }{{\mathrm{d}}{\widehat{Q}}_{n}(q_{k}) }= \left( \frac{{\mathrm{d}} H^{*}}{{\mathrm{d}} {\widehat{Q}}_{n}(q_{k})} + H^{*} \right) \left( q_k - F_X\left( {\widehat{Q}}_{n}(q_{k}) \right) - H^{*} {\widehat{Q}}_{n}(q_{k}) f\left( {\widehat{Q}}_{n}(q_{k})\right) \right)$$

We substitute the unique root \(Q(q_k)\) for \({\widehat{Q}}_{n}(q_{k})\) and get

$$\begin{aligned}&\frac{ {\mathrm{d}}w\left( {\widehat{Q}}_{n}(q_{k})\right) }{{\mathrm{d}} {\widehat{Q}}_{n}(q_{k})} \bigg |_{{\widehat{Q}}_{n}(q_{k})=Q(q_k)}\\&\quad =\left( \frac{{\mathrm{d}} H^{*}}{{\mathrm{d}}{\widehat{Q}}_{n}(q_{k})} + H^{*} \right) (q_k - F_X\left( Q(q_{k}) \right) - H^{*} Q(q_{k}) f\left( Q(q_{k})\right) ) \\&\quad = 0-H^{*} Q(q_k) f_X(Q(q_k))<0. \end{aligned}$$

This gives

$$\lim _{n \beta \rightarrow \infty , \beta \rightarrow 0} E\left( {\widehat{Q}}_n(q_k)\right) =Q(q_k)+O(\beta )$$


$${\mathrm{Var}}\left( {\widehat{Q}}_n(q_k)\right) =O(\beta )$$


$$\lim _{n \beta \rightarrow \infty , \beta \rightarrow 0} {\widehat{Q}}_n(q_k)=Q(q_k)$$

Rights and permissions

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Hammer, H.L., Yazidi, A. & Rue, H. Tracking of multiple quantiles in dynamically varying data streams. Pattern Anal Applic 23, 225–237 (2020).

Download citation


  • Dynamic environment
  • Incremental algorithm
  • Multiple quantiles
  • Quantile tracking