Abstract
Assume that there are multiple data streams (channels, sensors) and that in each stream the process of interest produces generally dependent and non-identically distributed observations. When the process is in a normal mode (in-control), the (pre-change) distribution is known, but when the process becomes abnormal there is parametric uncertainty, i.e., the post-change (out-of-control) distribution is known only partially, up to a parameter. Both the change point and the post-change parameter are unknown. Moreover, the change affects an unknown subset of streams, so that both the number of affected streams and their locations are unknown in advance. A good changepoint detection procedure should detect the change as soon as possible after its occurrence while controlling the risk of false alarms. We consider a Bayesian setup with a given prior distribution of the change point and propose two sequential mixture-based change detection rules: one mixes a Shiryaev-type statistic over both the unknown subset of affected streams and the unknown post-change parameter, and the other does the same with a Shiryaev–Roberts-type statistic. These rules generalize the mixture detection procedures studied by Tartakovsky (IEEE Trans Inf Theory 65(3):1413–1429, 2019) in the single-stream case. We provide sufficient conditions under which the proposed multistream change detection procedures are first-order asymptotically optimal with respect to moments of the delay to detection as the probability of false alarm approaches zero.
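To make the construction concrete, here is a minimal numerical sketch (not taken from the paper) of a mixture Shiryaev-type rule for a hypothetical Gaussian model: each stream is N(0, 1) before the change, and after the change an unknown subset B of streams acquires an unknown mean shift θ. The change point has a geometric(p) prior, and the mixing weights over subsets and over a grid of θ values are taken uniform; the function name, the grid, and all parameter choices below are illustrative assumptions.

```python
import numpy as np
from itertools import combinations

def shiryaev_mixture_rule(data, theta_grid, p=0.01, threshold=1000.0):
    """Mixture Shiryaev-type rule (illustrative sketch, Gaussian model).

    data: array of shape (T, n_streams); each stream is N(0,1) pre-change,
    and streams in an unknown subset B get an unknown mean shift theta
    post-change. For every candidate pair (B, theta) we run the standard
    Shiryaev recursion for a geometric(p) prior on the change point,
        R_n = Lambda_n * (R_{n-1} + p) / (1 - p),   R_0 = 0,
    where Lambda_n is the one-step likelihood ratio over the streams in B,
    and we stop the first time the uniformly weighted mixture of the
    per-(B, theta) statistics exceeds the threshold.
    """
    T, n_streams = data.shape
    # all non-empty candidate subsets of affected streams
    subsets = [B for r in range(1, n_streams + 1)
               for B in combinations(range(n_streams), r)]
    weight = 1.0 / (len(subsets) * len(theta_grid))  # uniform mixing weights
    R = np.zeros((len(subsets), len(theta_grid)))    # per-(B, theta) statistics
    for n in range(T):
        x = data[n]
        for i, B in enumerate(subsets):
            s = x[list(B)].sum()
            for j, theta in enumerate(theta_grid):
                # one-step LR of N(theta, 1) vs N(0, 1) over the streams in B
                lam = np.exp(theta * s - 0.5 * theta**2 * len(B))
                R[i, j] = lam * (R[i, j] + p) / (1.0 - p)
        if weight * R.sum() >= threshold:
            return n + 1  # alarm time (1-based)
    return None  # no alarm within the horizon

# Hypothetical scenario: change at time 30 in stream 0 only, true shift 1.0.
rng = np.random.default_rng(0)
data = rng.standard_normal((300, 2))
data[30:, 0] += 1.0
stop = shiryaev_mixture_rule(data, theta_grid=[0.5, 1.0, 1.5])
```

The rule raises an alarm shortly after the change; raising the threshold (i.e., shrinking the false-alarm probability) lengthens the delay, which is exactly the trade-off the optimality results quantify.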
References
Bakut PA, Bolshakov IA, Gerasimov BM, Kuriksha AA, Repin VG, Tartakovsky GP, Shirokov VV (1963) Statistical radar theory, vol 1. Tartakovsky GP (ed). Sovetskoe Radio, Moscow (in Russian)
Chan HP (2017) Optimal sequential detection in multi-stream data. Ann Stat 45(6):2736–2763
Fellouris G, Sokolov G (2016) Second-order asymptotic optimality in multichannel sequential detection. IEEE Trans Inf Theory 62(6):3662–3675. https://doi.org/10.1109/TIT.2016.2549042
Fellouris G, Tartakovsky AG (2017) Multichannel sequential detection—Part I: non-i.i.d. data. IEEE Trans Inf Theory 63(7):4551–4571. https://doi.org/10.1109/TIT.2017.2689785
Lai TL (1995) Sequential changepoint detection in quality control and dynamical systems (with discussion). J R Stat Soc Ser B Methodol 57(4):613–658
Lai TL (1998) Information bounds and quick detection of parameter changes in stochastic systems. IEEE Trans Inf Theory 44(7):2917–2929
Mei Y (2010) Efficient scalable schemes for monitoring a large number of data streams. Biometrika 97(2):419–433
Pollak M, Tartakovsky AG (2009) Optimality properties of the Shiryaev–Roberts procedure. Stat Sin 19(4):1729–1739
Polunchenko AS, Tartakovsky AG (2010) On optimality of the Shiryaev–Roberts procedure for detecting a change in distribution. Ann Stat 38(6):3445–3457
Tartakovsky AG (2005) Asymptotic performance of a multichart CUSUM test under false alarm probability constraint. In: Proceedings of the 44th IEEE conference decision and control and European control conference (CDC-ECC’05), Seville, SP, IEEE, Omnipress CD-ROM, pp 320–325
Tartakovsky AG (2017) On asymptotic optimality in sequential changepoint detection: non-iid case. IEEE Trans Inf Theory 63(6):3433–3450. https://doi.org/10.1109/TIT.2017.2683496
Tartakovsky AG (2019) Asymptotic optimality of mixture rules for detecting changes in general stochastic models. IEEE Trans Inf Theory 65(3):1413–1429. https://doi.org/10.1109/TIT.2018.287686
Tartakovsky AG, Brown J (2008) Adaptive spatial-temporal filtering methods for clutter removal and target tracking. IEEE Trans Aerosp Electron Syst 44(4):1522–1537
Tartakovsky AG, Veeravalli VV (2005) General asymptotic Bayesian theory of quickest change detection. Theory Probab Appl 49(3):458–497
Tartakovsky AG, Rozovskii BL, Blažek RB, Kim H (2006) Detection of intrusions in information systems by sequential change-point methods. Stat Methodol 3(3):252–293
Tartakovsky AG, Pollak M, Polunchenko AS (2012) Third-order asymptotic optimality of the generalized Shiryaev–Roberts changepoint detection procedures. Theory Probab Appl 56(3):457–484. https://doi.org/10.1137/S0040585X97985534
Tartakovsky AG, Nikiforov IV, Basseville M (2014) Sequential analysis: hypothesis testing and changepoint detection. Monographs on Statistics and Applied Probability. Chapman & Hall/CRC Press, Boca Raton
Willsky AS, Jones HL (1976) A generalized likelihood ratio approach to the detection and estimation of jumps in linear systems. IEEE Trans Autom Control 21(1):108–112
Xie Y, Siegmund D (2013) Sequential multi-sensor change-point detection. Ann Stat 41(2):670–692
Acknowledgements
The work was supported in part by the Russian Ministry of Education and Science 5-100 excellence project, the Russian Federation Ministry of Education and Science Arctic program and the grant 18-19-00452 from the Russian Science Foundation at the Moscow Institute of Physics and Technology.
The author would like to thank two referees for useful comments.
Appendix: An Auxiliary Lemma and Proofs
The following lemma is used extensively for obtaining upper bounds for the moments of the detection delay, which are needed for proving the asymptotic optimality properties of the introduced detection procedures. In this lemma, P is a generic probability measure and E is the corresponding expectation.
Lemma A.1
Let τ (τ = 0, 1, …) be a non-negative integer-valued random variable and let N (N ≥ 1) be an integer. Then, for any r ≥ 1,
Proof
□
1.1 Proof of Proposition 1
To prove asymptotic approximations (5.3) and (5.4) note first that by Eq. 5.1 the detection procedure \({T_{A}^{W}}\) belongs to class \(\mathbb {C}(1/(A+1))\), so replacing α by 1/(A + 1) in the asymptotic lower bounds (4.3) and (4.4), we obtain that under the right-tail condition C1 the following asymptotic lower bounds hold for all r > 0, \(\mathcal {B}\in \mathcal {P}\), and 𝜃 ∈Θ:
Therefore, to prove the assertions of the proposition it suffices to show that, under the left-tail condition C2, for all 0 < m ≤ r, \(\mathcal {B}\in \mathcal {P}\), and 𝜃 ∈Θ
The proof of part (i). Let \(\pi ^{A}=\{{\pi _{k}^{A}}\}\), \({\pi _{k}^{A}}=\pi _{k}^{\alpha }\) for α = αA = 1/(1 + A), and define
Obviously, for any n ≥ 1,
where Γδ,𝜃 = {𝜗 ∈Θ : |𝜗 − 𝜃| < δ}, so that for any \(\mathcal {B}\in \mathcal {P}\), 𝜃 ∈Θ, \(k\in \mathbb {Z}_{+}\)
It is easy to see that for n ≥ NA the last probability does not exceed the probability
Since, by condition CP1, \(N_{A}^{-1} |\log {\Pi }_{k-1+N_{A}}^{A}| \to \mu \) as A →∞, for a sufficiently large value of A there exists a small κ = κA (κA → 0 as A →∞) such that
Hence, for all sufficiently large A and n such that \(\frac {1}{n}[|\log p_{\mathcal {B}}| + |\log W({\Gamma }_{\delta ,\theta })|] \le \varepsilon /2\), we have
By Lemma A.1, for any \(k\in \mathbb {Z}_{+}\), \(\mathcal {B}\in \mathcal {P}\), and 𝜃 ∈Θ we have the following inequality
which along with (A.7) yields
Now, note that
and hence,
Recall that we set \({{\Pi }_{k}^{A}}={\Pi }^{\alpha }_{k}\) with α = αA = 1/(1 + A). It follows from Eqs. A.9 and A.10 that
Since by condition C2, \({\Upsilon }_{r}(\varepsilon /2,\mathcal {B}, \theta ) < \infty \) for all \(\mathcal {B}\in \mathcal {P}\), 𝜃 ∈Θ, and ε > 0 and, by condition CP3, \((A{\Pi }_{k-1}^{A})^{-1} \to 0\), \(|\log {\pi _{k}^{A}}|/\log A \to 0\) as A →∞, inequality (A.11) implies the asymptotic inequality
Since ε can be arbitrarily small, this implies the asymptotic upper bound (A.4) (for all 0 < m ≤ r, \(\mathcal {B}\in \mathcal {P}\), and 𝜃 ∈Θ). This upper bound and the lower bound (A.2) prove the asymptotic relation (5.3). The proof of (i) is complete.
The proof of part (ii). Using the inequalities (A.11) and \(1-\mathsf {PFA}_{\pi }({T_{A}^{W}}) \ge A/(1+A)\), we obtain that for any \(0<\varepsilon < I_{\mathcal {B},\theta }+\mu \)
By condition C2, \({\Upsilon }_{r}(\varepsilon /2,\mathcal {B},\theta ) < \infty \) for any ε > 0, \(\mathcal {B}\in \mathcal {P}\), and 𝜃 ∈Θ and, by condition CP2, \({\sum }_{k=0}^{\infty } {\pi _{k}^{A}} |{\log \pi _{k}^{A}}|^{r} =o(|\log A|^{r})\) as A →∞, which implies that for all \(\mathcal {B}\in \mathcal {P}\) and 𝜃 ∈Θ
Since ε can be arbitrarily small, the asymptotic upper bound (A.5) follows and the proof of the asymptotic approximation (5.4) is complete.
1.2 Proof of Proposition 2
As before, \({\pi _{k}^{A}}=\pi _{k}^{\alpha _{A}}\), so \(\bar \nu _{A}=\bar \nu _{\alpha _{A}}\) and \(\omega _{A}=\omega _{\alpha _{A}}\), where |log αA|∼ log A.
For ε ∈ (0,1), let
Recall that
(see Eq. 5.7), so using Chebyshev’s inequality, we obtain
Analogously to Eq. 4.9,
Since
we have
By condition (5.9),
which implies that ωA = o(Aγ) as A →∞ for any γ > 0. Therefore, \(U_{M_{A},k}({\widetilde {T}_{A}^{W}})\to 0\) as A →∞ for any fixed k. Also, \(\beta _{M_{A},k}(\varepsilon ,\mathcal {B},\theta )\to 0\) by condition C1, so that \(\mathsf {P}_{k, \mathcal {B},\theta }({0 < {\widetilde {T}_{A}^{W}} -k < M_{A}})\to 0\) for any fixed k. It follows from Eq. A.13 that for an arbitrary ε ∈ (0,1) as A →∞
which yields the asymptotic lower bound (for any fixed \(k\in \mathbb {Z}_{+}\), \(\mathcal {B}\in \mathcal {P}\), and 𝜃 ∈Θ)
To prove (5.10) it suffices to show that this bound is attained by \({\widetilde {T}_{A}^{W}}\), i.e.,
Define
By Lemma A.1, for any \(k\in \mathbb {Z}_{+}\), \(\mathcal {B}\in \mathcal {P}\), and 𝜃 ∈Θ,
and since for any n ≥ 1,
in just the same way as in the proof of Proposition 1 (setting \({\pi _{k}^{A}}=1\)) we obtain that for all \(n \ge \widetilde {M}_{A}\)
Hence, for all sufficiently large n such that \(\frac {1}{n}\left [|\log W({\Gamma }_{\delta ,\theta })| +|\log p_{\mathcal {B}}|\right ] \le \varepsilon /2\),
Using Eqs. A.19 and A.20, we obtain
which along with the inequality \(\mathsf {P}_{\infty }({\widetilde {T}_{A}^{W}} > k) > 1- (\omega _{A} +k)/A\) (see Eq. 5.7) implies the inequality
Since, due to Eq. A.16, ωA/A → 0 and, by condition C2, \({\Upsilon }_{r}(\varepsilon /2,\mathcal {B}, \theta ) < \infty \) for all ε > 0, \(\mathcal {B}\in \mathcal {P}\), 𝜃 ∈Θ, inequality (A.22) implies the asymptotic inequality
Since ε can be arbitrarily small the asymptotic upper bound (A.18) follows and the proof of the asymptotic approximation (5.10) is complete.
In order to prove (5.11), note first that using Eq. A.13 yields the lower bound
Let KA be an integer that approaches infinity as A →∞ at rate O(Aγ), γ > 0. Now, using Eqs. A.14 and A.15, we obtain
Note that, due to Eq. A.16, \((\omega _{A}+\bar \nu _{A})/A^{\gamma } \to 0\) as A →∞ for any γ > 0. As a result, the first two terms in Eq. A.24 go to zero as A →∞ (by Markov's inequality, \(\mathsf {P}(\nu >K_{A}) \le \bar \nu _{A}/K_{A} = \bar \nu _{A}/O(A^{\gamma }) \to 0\)), and the last term also goes to zero by condition C1 and Lebesgue's dominated convergence theorem. Thus, for all 0 < ε < 1, \(\mathsf {P}^{\pi }_{\mathcal {B}}(0< {\widetilde {T}_{A}^{W}} -\nu < M_{A})\) approaches 0 as A →∞. Using inequality (A.23), we obtain that for any 0 < ε < 1 as A →∞
which yields the asymptotic lower bound (for any r > 0, \(\mathcal {B}\in \mathcal {P}\), and 𝜃 ∈Θ)
To obtain the upper bound it suffices to use inequality (A.21), which along with the fact that \(\mathsf {PFA}_{\pi }({\widetilde {T}_{A}^{W}}) \le (\bar \nu _{A} +\omega _{A})/A\) yields (for every \(0<\varepsilon < I_{\mathcal {B},\theta }\))
Since \((\omega _{A}+\bar \nu _{A})/A\to 0\) and, by condition C2, \({\Upsilon }_{r}(\varepsilon /2,\mathcal {B},\theta ) < \infty \) for any ε > 0, \(\mathcal {B}\in \mathcal {P}\), and 𝜃 ∈Θ we obtain that, for every \(0<\varepsilon < I_{\mathcal {B},\theta }\) as A →∞,
which implies
since ε can be arbitrarily small.
Applying the bounds (A.25) and (A.26) together completes the proof of Eq. 5.11.
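As an informal numerical companion to these first-order results (our illustration, not part of the proofs above): for the classical single-stream Shiryaev rule with a geometric(p) prior on the change point and a Gaussian mean shift of size θ, the average detection delay is, to first order as the threshold A grows, log A/(I + |log(1 − p)|), where I = θ²/2 is the Kullback–Leibler information number. The Monte Carlo sketch below checks that the empirical delay is of this order; all parameter choices are ours.

```python
import numpy as np

def shiryaev_delay(A, theta=1.0, p=0.1, rng=None):
    """Detection delay of the single-stream Shiryaev rule when the change
    (N(0,1) -> N(theta,1) mean shift) is in effect from time 0."""
    rng = rng if rng is not None else np.random.default_rng()
    R, n = 0.0, 0
    while R < A:
        n += 1
        x = rng.standard_normal() + theta         # post-change observation
        lam = np.exp(theta * x - 0.5 * theta**2)  # one-step likelihood ratio
        R = lam * (R + p) / (1.0 - p)             # Shiryaev recursion
    return n

rng = np.random.default_rng(1)
A, theta, p = 1e4, 1.0, 0.1
avg = np.mean([shiryaev_delay(A, theta, p, rng) for _ in range(500)])
# first-order prediction: log A / (I + |log(1 - p)|) with I = theta^2 / 2
pred = np.log(A) / (0.5 * theta**2 + abs(np.log(1.0 - p)))
```

The ratio avg/pred is of order one and tightens toward one as A grows, which is what first-order asymptotic optimality asserts.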
Tartakovsky, A.G. Asymptotically Optimal Quickest Change Detection in Multistream Data—Part 1: General Stochastic Models. Methodol Comput Appl Probab 21, 1303–1336 (2019). https://doi.org/10.1007/s11009-019-09735-3
Keywords
- Asymptotic optimality
- Changepoint detection
- General non-i.i.d. models
- Hidden Markov models
- Moments of the delay to detection
- r-Complete convergence
- Statistical process control
- Surveillance