Detection and Analysis of Spikes in a Random Sequence

Methodology and Computing in Applied Probability

Abstract

Motivated by the increasing frequency of natural and anthropogenic hazards, we revisit the problem of assessing whether an apparent temporal clustering in a sequence of randomly occurring events is a genuine surprise that calls for closer examination. We study the problem in both discrete and continuous time formulations. In the discrete formulation, the problem reduces to deriving the probability that p independent people all have birthdays within d days of each other. We provide an analytical expression for a warning limit: if a subset of p people among n is observed to have birthdays within d days of each other and d is smaller than the warning limit, the cluster should be treated as surprising. In the continuous time framework, three different sets of results are given. First, we provide an asymptotic analysis of the problem by embedding it into an extreme value problem for high order spacings of iid samples from the U[0, 1] density. Second, a novel analytical nonasymptotic bound is derived by using certain tools of empirical process theory. Finally, the required probability is approximated by using various bounds and asymptotic results on the supremum of the scanning process of a one-dimensional stationary Poisson process. We apply the theory to climate change related datasets, datasets on temperatures, and mass shooting records in the United States. These real data applications of our theoretical methods provide supporting evidence for climate change and for recent spikes in gun violence.
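
As an illustration of the discrete formulation, the following Python sketch estimates by Monte Carlo simulation the probability that some p of n people have birthdays within d days of each other. It is not the analytical warning limit derived in the paper. It assumes a non-circular year of 365 equally likely days, and the function names, trial count, and seed are hypothetical choices made only for this example.

```python
import random


def has_cluster(days, p, d):
    """Return True if some p of the given days span at most d days.

    Assumption: the calendar is linear (day 364 is not adjacent to day 0).
    After sorting, a cluster exists exactly when some run of p consecutive
    sorted values has range <= d.
    """
    s = sorted(days)
    return any(s[i + p - 1] - s[i] <= d for i in range(len(s) - p + 1))


def estimate_cluster_probability(n, p, d, year_length=365, trials=20000, seed=0):
    """Monte Carlo estimate of P(some p of n birthdays fall within d days).

    Birthdays are drawn independently and uniformly from 0..year_length-1
    (an assumption of this sketch, not a claim about the paper's model).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        days = [rng.randrange(year_length) for _ in range(n)]
        if has_cluster(days, p, d):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Example: chance that 3 of 20 people have birthdays within 2 days.
    print(estimate_cluster_probability(n=20, p=3, d=2))
```

With n = 23, p = 2, and d = 0 the sketch reduces to the classical birthday problem, so the estimate should fall near 0.507. An observed cluster would be a candidate surprise when the estimated probability for its (n, p, d) is small.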

Acknowledgements

We are greatly indebted to Joe Glaz and Christian Robert for carefully reading earlier drafts of this manuscript and for contributing to the development of the results. Comments from two anonymous reviewers greatly improved this paper, and we are much indebted to them. Li’s research is partially supported by NSF grants DPP-1418339 and AGS-1602845 and NASA-NNX14A080G, and DasGupta’s research is partially supported by grant 206057 from Elsevier Global Analytics.

Author information

Corresponding author

Correspondence to Anirban Dasgupta.

About this article

Cite this article

Dasgupta, A., Li, B. Detection and Analysis of Spikes in a Random Sequence. Methodol Comput Appl Probab 20, 1429–1451 (2018). https://doi.org/10.1007/s11009-018-9637-0

  • DOI: https://doi.org/10.1007/s11009-018-9637-0
